To minimize read and write latency when optimizing a Spark job for processing a large dataset stored in Azure Data Lake Storage Gen2, which configuration should be prioritized?
A. Adjusting spark.sql.shuffle.partitions to match the number of cores
B. Configuring the spark.hadoop.fs.azure integration properties correctly
C. Increasing spark.executor.instances
D. Setting spark.databricks.io.cache.enabled to true
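For context, here is a minimal sketch of where each of the four configurations would be set on a SparkSession. The storage account name, key placeholder, and values shown are illustrative assumptions, not values from the question; the fs.azure property shown is the account-key authentication style for the ADLS Gen2 (abfss) connector, and the disk cache property applies on Databricks runtimes only.

```python
from pyspark.sql import SparkSession

# Hypothetical storage account name, used for illustration only.
ACCOUNT = "mystorageaccount"

spark = (
    SparkSession.builder
    .appName("adls-gen2-tuning")
    # Option A: size shuffle partitions relative to the cluster's core count
    # (200 is Spark's default; an illustrative value here).
    .config("spark.sql.shuffle.partitions", "200")
    # Option B: ADLS Gen2 connector property (account-key auth shown;
    # the key itself is a placeholder).
    .config(
        f"spark.hadoop.fs.azure.account.key.{ACCOUNT}.dfs.core.windows.net",
        "<storage-account-key>",
    )
    # Option C: scale out the number of executors (illustrative value).
    .config("spark.executor.instances", "8")
    # Option D: enable the Databricks disk cache to speed up repeated reads
    # of remote Parquet data (has no effect outside Databricks).
    .config("spark.databricks.io.cache.enabled", "true")
    .getOrCreate()
)

# Illustrative read from ADLS Gen2 via the abfss scheme; the path is a placeholder.
df = spark.read.parquet(f"abfss://container@{ACCOUNT}.dfs.core.windows.net/path/to/data")
```

Note that options A, C, and D tune Spark's compute and caching behavior, while option B governs how Spark talks to ADLS Gen2 in the first place; misconfigured fs.azure properties affect every read and write against the storage layer.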