
Answer-first summary for fast verification
Answer: spark.sql.autoBroadcastJoinThreshold
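The correct answer is B. The `spark.sql.autoBroadcastJoinThreshold` property sets the maximum size (in bytes) of a table or DataFrame that Spark will automatically broadcast to all worker nodes when performing a join. If Spark's size estimate for one side of the join is at or below this threshold, the planner broadcasts that side so the larger side does not have to be shuffled. The default is 10 MB (10485760 bytes), and setting the property to -1 disables automatic broadcasting. The other options do not control this behavior: A (`spark.sql.broadcastTimeout`) sets how long Spark waits for a broadcast to complete, C (`spark.sql.shuffle.partitions`) sets the number of partitions used when shuffling data for joins and aggregations, D (`spark.sql.inMemoryColumnarStorage.batchSize`) sets the batch size for in-memory columnar caching, and E (`spark.sql.adaptive.localShuffleReader.enabled`) toggles the local shuffle reader used by adaptive query execution.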
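The size check described above can be sketched in plain Python. This is a simplified model, not Spark's implementation: `would_auto_broadcast` and `DEFAULT_THRESHOLD_BYTES` are hypothetical names introduced for illustration, and the real planner also weighs join type, hints, and the quality of its size statistics.

```python
# Simplified model of the size check behind spark.sql.autoBroadcastJoinThreshold.
# would_auto_broadcast is a hypothetical helper, not part of the Spark API.

DEFAULT_THRESHOLD_BYTES = 10 * 1024 * 1024  # Spark's documented default: 10 MB


def would_auto_broadcast(estimated_size_bytes: int,
                         threshold_bytes: int = DEFAULT_THRESHOLD_BYTES) -> bool:
    """Return True if a table of this estimated size qualifies for automatic broadcast."""
    if threshold_bytes < 0:
        # A threshold of -1 disables automatic broadcasting entirely.
        return False
    return estimated_size_bytes <= threshold_bytes


if __name__ == "__main__":
    five_mb = 5 * 1024 * 1024
    fifty_mb = 50 * 1024 * 1024
    print(would_auto_broadcast(five_mb))    # small side fits under the default
    print(would_auto_broadcast(fifty_mb))   # too large for the default threshold
    print(would_auto_broadcast(five_mb, threshold_bytes=-1))  # broadcasting disabled
```

In a live PySpark session the threshold itself would typically be changed with `spark.conf.set("spark.sql.autoBroadcastJoinThreshold", <bytes>)`, where `spark` is assumed to be an existing `SparkSession`.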
Author: LeetQuiz Editorial Team
Which Spark property controls the automatic broadcasting of DataFrames that fall below a specified size threshold during runtime?
A. spark.sql.broadcastTimeout
B. spark.sql.autoBroadcastJoinThreshold
C. spark.sql.shuffle.partitions
D. spark.sql.inMemoryColumnarStorage.batchSize
E. spark.sql.adaptive.localShuffleReader.enabled