
Answer-first summary for fast verification
Answer: Shuffle
In Apache Spark, stage boundaries are created by operations that require a shuffle. A shuffle happens in wide transformations (for example `reduceByKey`, `groupByKey`, or `repartition`), where records must be redistributed across partitions, so the scheduler ends the current stage and starts a new one. Narrow transformations such as `map` and `filter` are pipelined within a single stage. Among the options provided, only 'Shuffle' (A) directly causes a stage boundary. Caching (B) does not: it marks data for persistence so later actions can reuse it, but it involves no shuffle. Executor failure (C), job delegation (D), and application failure (E) concern runtime execution or fault tolerance and do not change how a job is logically divided into stages. Therefore, the correct answer is A.
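The rule above can be sketched with a small toy model in plain Python. This is not Spark's API or its scheduler: `WIDE`, `split_into_stages`, and the `pipeline` list are hypothetical names invented for illustration, and the model is simplified to a linear chain of operations in which every wide transformation starts a new stage.

```python
# Toy model only, not Spark code: split a linear chain of operation
# names into stages, starting a new stage at every shuffle.

# Assumption: a small, non-exhaustive set of wide (shuffle-inducing) transformations.
WIDE = {"reduceByKey", "groupByKey", "repartition", "distinct"}


def split_into_stages(lineage):
    """Group operation names into stages; each wide operation opens a new stage."""
    stages = [[]]
    for op in lineage:
        if op in WIDE:
            stages.append([])  # shuffle boundary: close the current stage
        stages[-1].append(op)  # narrow ops (and cache) stay in the current stage
    return stages


# Hypothetical word-count-style pipeline; "cache" sits between narrow operations.
pipeline = ["textFile", "flatMap", "map", "cache", "reduceByKey", "filter", "repartition"]
stages = split_into_stages(pipeline)

for index, ops in enumerate(stages):
    print(f"stage {index}: {ops}")
```

In this sketch the two wide operations yield three stages, while `cache` stays inside the first stage alongside the narrow transformations, which mirrors why option B does not induce a boundary.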
Author: LeetQuiz Editorial Team