
Answer-first summary for fast verification
Answer: The **Stage's detail screen** and the **Executor's log files**.
The correct answer is **C: The Stage's detail screen and the Executor's log files.**

* **Stage's detail screen:** On a specific stage's page in the Spark UI, the shuffle metrics include two columns: *Shuffle spill (memory)* and *Shuffle spill (disk)*. Non-zero values in the disk column confirm that data was serialized and written to disk during that stage.
* **Executor's log files:** From the **Executors** tab, open an executor's stdout/stderr logs and search for entries from `UnsafeExternalSorter`. Messages such as "Spilling data because number of spilledRecords crossed the threshold" are definitive evidence of spilling at the task level.

**Why the other options are incorrect:**

* **Query (SQL) detail screen:** This view shows high-level physical and logical plan metrics but does not expose the granular, task-level shuffle spill counters found in the Stage view.
* **Driver's log files:** Spilling is an executor-side event. The driver does not manage the executors' local shuffle memory, so spill events are not recorded in the driver logs.
* **Executor's detail screen:** The Executors tab provides aggregate metrics, but the per-task spill counters are most effectively analyzed on the Stage detail page.
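To make the log-file check concrete, here is a minimal Python sketch that scans executor log text for spill messages. The sample log lines below are invented for illustration (real logs are reached via the Spark UI's Executors tab or the worker's log directory); only the `UnsafeExternalSorter` message texts mirror what Spark actually emits.

```python
import re

# Invented sample of executor stderr output; the two UnsafeExternalSorter
# lines mimic the real messages Spark logs when a task spills to disk.
executor_log = """\
24/05/01 12:00:01 INFO Executor: Running task 3.0 in stage 7.0 (TID 42)
24/05/01 12:00:05 INFO UnsafeExternalSorter: Thread 61 spilling sort data of 512.0 MiB to disk (2 times so far)
24/05/01 12:00:09 INFO UnsafeExternalSorter: Spilling data because number of spilledRecords crossed the threshold 4096
24/05/01 12:00:12 INFO Executor: Finished task 3.0 in stage 7.0 (TID 42)
"""

# Lines mentioning "spill" from a sorter class indicate a disk spill.
spill_pattern = re.compile(r"(UnsafeExternalSorter|ExternalSorter).*[Ss]pill")

spill_events = [line for line in executor_log.splitlines()
                if spill_pattern.search(line)]

for event in spill_events:
    print(event)

print(f"Detected {len(spill_events)} spill event(s)")
```

The same pattern can be pointed at a real executor log file; any match is the task-level confirmation of spilling that the Stage screen's *Shuffle spill (disk)* column summarizes in aggregate.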
Author: LeetQuiz Editorial Team
During the execution of wide transformations in Apache Spark, data may 'spill' to disk if the executor's memory is insufficient. Which two locations in the Spark UI provide the primary indicators that partitions are spilling to disk?
A
The Driver's log files and the Executor's log files.
B
The Stage's detail screen and the Query's (SQL) detail screen.
C
The Stage's detail screen and the Executor's log files.
D
The Executor's detail screen and the Executor's log files.