Which technique is most effective for minimizing shuffle and optimizing performance in a Spark multi-table join where the tables vary in size?
A. Enforcing a uniform repartition across all tables before joining.
B. Utilizing the sortMergeJoin explicitly in all join operations.
C. Applying broadcast hints selectively based on table sizes and existing statistics.
D. Defaulting to crossJoin for all operations, assuming Spark will optimize under the hood.
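
To make the size-based approach named in option C concrete, here is a minimal sketch in plain Python of how one might decide which tables to mark for broadcast. The function name `choose_broadcast_tables`, the table names, and the byte sizes are all hypothetical and chosen only for illustration. The 10 MB default mirrors Spark's default value for `spark.sql.autoBroadcastJoinThreshold`. The sketch assumes size estimates are already available, for example from table statistics.

```python
# Spark's default for spark.sql.autoBroadcastJoinThreshold is 10 MB.
DEFAULT_BROADCAST_THRESHOLD = 10 * 1024 * 1024


def choose_broadcast_tables(table_sizes, threshold=DEFAULT_BROADCAST_THRESHOLD):
    """Return the names of tables small enough to broadcast, smallest first.

    table_sizes is a hypothetical mapping of table name -> estimated size
    in bytes. Tables whose size is unknown (None) are never selected.
    """
    small = [
        (size, name)
        for name, size in table_sizes.items()
        if size is not None and size <= threshold
    ]
    return [name for size, name in sorted(small)]


# Hypothetical size estimates for a multi-table join.
sizes = {
    "orders": 50 * 1024**3,      # 50 GB fact table
    "customers": 200 * 1024**2,  # 200 MB, above the threshold
    "countries": 40 * 1024,      # 40 KB lookup table
    "currencies": 8 * 1024,      # 8 KB lookup table
    "returns": None,             # no statistics available
}

to_broadcast = choose_broadcast_tables(sizes)
print(to_broadcast)  # ['currencies', 'countries']
```

In PySpark, the selected DataFrames would then typically be wrapped with `pyspark.sql.functions.broadcast(df)` before joining, for example `orders.join(broadcast(countries), "country_code")`, where the DataFrame and column names are again assumptions. The large tables are left unhinted so that Spark chooses their join strategy.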