When dealing with data skew in a Spark job that processes data from Azure Blob Storage, which of the following techniques is least effective for managing skew?
A. Applying a custom partitioner that considers data distribution
B. Employing salting techniques before shuffling
C. Using the coalesce function to reduce the number of partitions
D. Increasing the spark.sql.shuffle.partitions parameter
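
To make the options more concrete, here is a minimal sketch in plain Python (not Spark code) that simulates how records land in partitions. It assumes a hash-style partitioner, modeled here with zlib.crc32, and a made-up dataset in which one hypothetical key, "hot", accounts for 90% of the records. The helper names (partition_of, partition_sizes, salt_keys, coalesce_sizes) are illustrative only and do not correspond to any Spark API. The coalesce model is a simplification: it merges neighboring partitions without moving individual records, which is the property that matters for skew.

```python
import zlib
from collections import Counter


def partition_of(key, num_partitions):
    # Deterministic stand-in for a hash partitioner (assumption: Spark's
    # real hash function differs, but the key-to-partition idea is the same).
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_sizes(keys, num_partitions):
    # Count how many records each partition would receive.
    counts = Counter(partition_of(k, num_partitions) for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


def salt_keys(keys, num_salts):
    # Salting: append a small rotating suffix so that records sharing one
    # key are spread over several distinct salted keys.
    return [f"{k}_{i % num_salts}" for i, k in enumerate(keys)]


def coalesce_sizes(sizes, target):
    # Simplified model of coalesce: merge whole partitions into fewer
    # groups without redistributing the records inside them.
    merged = [0] * target
    for i, size in enumerate(sizes):
        merged[i * target // len(sizes)] += size
    return merged


# Hypothetical skewed dataset: 9,000 records share the key "hot".
keys = ["hot"] * 9000 + [f"k{i}" for i in range(1000)]

plain = partition_sizes(keys, 8)                 # skewed: one partition holds every "hot" record
salted = partition_sizes(salt_keys(keys, 8), 8)  # the hot key is split across salted keys
coalesced = coalesce_sizes(plain, 2)             # fewer partitions, same hot spot

print("largest partition, plain:    ", max(plain))
print("largest partition, salted:   ", max(salted))
print("largest partition, coalesced:", max(coalesced))
```

In this simulation, salting shrinks the largest partition because the hot key no longer hashes to a single place, while coalescing can only keep the largest partition the same size or make it bigger, since it merges partitions rather than rebalancing them.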