
Explanation:
The correct answer is E. The property spark.sql.adaptive.coalescePartitions.enabled is part of Spark's Adaptive Query Execution (AQE) framework. Its default value is true, but it only takes effect when spark.sql.adaptive.enabled is also true (AQE was introduced in Spark 3.0 and is enabled by default since Spark 3.2). With both settings on, Spark uses the shuffle statistics collected at runtime to merge contiguous shuffle partitions that are smaller than the target size (spark.sql.adaptive.advisoryPartitionSizeInBytes) into larger ones. This avoids scheduling many tiny tasks and so reduces overhead. Option A, spark.sql.shuffle.partitions, sets the static number of shuffle partitions but does not coalesce them dynamically. Options B, C, and D control the broadcast join size threshold, skew join handling, and the in-memory columnar cache batch size, respectively, and are unrelated to partition coalescing.
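To make the idea of coalescing concrete, here is a minimal Python sketch of merging contiguous small partitions up to a target size. It is a simplified model for illustration only, not Spark's actual implementation: the function name coalesce_partitions, the sample sizes, and the greedy rule are all assumptions. The 64 MB target mirrors the documented default of spark.sql.adaptive.advisoryPartitionSizeInBytes.

```python
from typing import List


def coalesce_partitions(sizes_mb: List[int], target_mb: int) -> List[List[int]]:
    """Greedily group contiguous partition indices.

    Hypothetical helper: a group is closed as soon as adding the next
    partition would push its total size above target_mb. A partition that
    is already larger than the target stays in a group of its own.
    """
    groups: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for index, size in enumerate(sizes_mb):
        # Close the current group if this partition would overflow it.
        if current and current_size + size > target_mb:
            groups.append(current)
            current, current_size = [], 0
        current.append(index)
        current_size += size
    if current:
        groups.append(current)
    return groups


# Made-up shuffle partition sizes in MB, for illustration only.
sizes = [5, 10, 3, 70, 2, 4, 60]
groups = coalesce_partitions(sizes, target_mb=64)
print(groups)  # [[0, 1, 2], [3], [4, 5], [6]]
```

In this toy run, seven shuffle partitions collapse into four groups, so four tasks would be scheduled instead of seven. With the static spark.sql.shuffle.partitions setting alone, the partition count would stay fixed regardless of how small the individual partitions turn out to be.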
Which Spark property configures whether DataFrame partitions below a minimum size threshold are automatically coalesced into larger partitions during a shuffle?
A. spark.sql.shuffle.partitions
B. spark.sql.autoBroadcastJoinThreshold
C. spark.sql.adaptive.skewJoin.enabled
D. spark.sql.inMemoryColumnarStorage.batchSize
E. spark.sql.adaptive.coalescePartitions.enabled