Your Spark jobs on Databricks are taking longer and costing more than expected, and you suspect inefficient resource utilization and data skew. Which optimization strategy would you implement?
A. Convert all your data to Delta format, assuming it will automatically optimize all operations.
B. Schedule jobs to run during off-peak hours to benefit from reduced costs, without changing job configurations.
C. Increase the number of worker nodes in your Databricks cluster to reduce job completion time.
D. Analyze and refactor your Spark jobs to better handle data skew, for example through salting techniques, and fine-tune Spark configurations for optimal resource utilization.