
Explanation:
The question asks which code block returns a 4-partition DataFrame from the 8-partition DataFrame storesDF without inducing a shuffle. coalesce(4) (option C) reduces the partition count by merging existing partitions, a narrow transformation that avoids a full shuffle, so it is the correct choice. repartition always triggers a shuffle, even when reducing the partition count, which rules out options A and D. Option B also calls repartition but passes no partition count, so it does not request 4 partitions in the first place. Option E is invalid because it references coalesce without calling it or supplying the required partition count. Therefore, the only correct option is C.
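The difference between merging partitions and redistributing rows can be sketched with a toy model in plain Python. This is not Spark's implementation: the helpers toy_coalesce and toy_repartition are hypothetical names, partitions are modeled as lists of row ids, the merge simply groups neighbouring partitions (Spark's real grouping also considers data locality), and the shuffle is modeled as a round-robin deal of individual rows.

```python
from typing import List

Partition = List[int]  # a partition is modeled as a list of row ids


def toy_coalesce(partitions: List[Partition], n: int) -> List[Partition]:
    """Merge whole neighbouring partitions into n groups; no partition is split."""
    if n >= len(partitions):
        # Like Spark's coalesce, this toy version never increases the count.
        return [list(p) for p in partitions]
    merged: List[Partition] = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        # Assumption: adjacent partitions are grouped together.
        merged[i * n // len(partitions)].extend(part)
    return merged


def toy_repartition(partitions: List[Partition], n: int) -> List[Partition]:
    """Deal every row round-robin into n new partitions (models a shuffle)."""
    out: List[Partition] = [[] for _ in range(n)]
    rows = [row for part in partitions for row in part]
    for i, row in enumerate(rows):
        out[i % n].append(row)
    return out


# Hypothetical stand-in for storesDF: 8 partitions with 2 rows each.
stores = [[2 * i, 2 * i + 1] for i in range(8)]

coalesced = toy_coalesce(stores, 4)
repartitioned = toy_repartition(stores, 4)

print(len(coalesced), coalesced[0])          # rows of each old partition stay together
print(len(repartitioned), repartitioned[0])  # rows of each old partition are scattered
```

Both results have 4 partitions, but only the coalesced one keeps every original partition intact inside a single new partition, which is why no rows need to move between unrelated partitions. On a real DataFrame, the partition count can be checked with storesDF.coalesce(4).rdd.getNumPartitions(), and explain() shows whether the physical plan contains an exchange (shuffle) step.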
Which of the following code blocks will consistently produce a new 4-partition DataFrame from an 8-partition DataFrame named storesDF without causing a shuffle operation?
A
storesDF.repartition(4, "sqft")
B
storesDF.repartition()
C
storesDF.coalesce(4)
D
storesDF.repartition(4)
E
storesDF.coalesce