A Spark job that reads from Delta Lake is running slowly because the data is skewed. Which advanced technique can effectively reduce the impact of that skew on query performance?
A. Using the salting technique: adding a random prefix to the join keys.
B. Applying a broadcast join to manage skewed datasets automatically.
C. Executing repartition(1) to merge all data into a single partition.
D. Using coalesce() without shuffling to decrease the number of partitions.
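
Option A refers to salting. The block below is a minimal, Spark-free sketch of the idea in plain Python, not a definitive implementation: a random salt is prepended to each key on the skewed side, and every row on the other side is replicated once per salt value so the join still matches. The names `partition_of`, `salted_join`, `NUM_PARTITIONS`, and `NUM_SALTS`, the byte-sum partitioner, and the sample data are all assumptions made for illustration. In PySpark the same pattern is usually built from `rand()`, `concat()`, and `explode()` in `pyspark.sql.functions`.

```python
import random
from collections import Counter

NUM_PARTITIONS = 8   # assumed shuffle partition count
NUM_SALTS = 8        # assumed salt range; tune to the degree of skew


def partition_of(key: str) -> int:
    """Toy stand-in for a hash partitioner (byte sum modulo partition count)."""
    return sum(key.encode()) % NUM_PARTITIONS


def partition_loads(keys):
    """Count how many rows land in each partition."""
    return Counter(partition_of(k) for k in keys)


def salted_join(large, small, num_salts, rng):
    """Join (key, value) rows after salting the keys of the skewed side."""
    # Skewed side: prepend a random salt to every key.
    salted_large = [(f"{rng.randrange(num_salts)}_{k}", v) for k, v in large]
    # Other side: replicate each row once per salt so every salted key can match.
    salted_small = {}
    for k, v in small:
        for s in range(num_salts):
            salted_small.setdefault(f"{s}_{k}", []).append(v)
    joined = [
        (sk.split("_", 1)[1], lv, sv)  # strip the salt to recover the original key
        for sk, lv in salted_large
        for sv in salted_small.get(sk, [])
    ]
    return salted_large, joined


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical data: one "hot" key dominates the large table.
large = [("hot", i) for i in range(1000)] + [(f"k{i}", i) for i in range(20)]
small = [("hot", "H")] + [(f"k{i}", f"S{i}") for i in range(20)]

salted_large, joined = salted_join(large, small, NUM_SALTS, rng)

before = partition_loads(k for k, _ in large)
after = partition_loads(k for k, _ in salted_large)
print("largest partition before salting:", max(before.values()))
print("largest partition after salting: ", max(after.values()))
print("joined rows:", len(joined))
```

The trade-off in this sketch is that the non-skewed side grows by a factor of `NUM_SALTS`, so the salt range is normally kept just large enough to break up the hot keys.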