
Answer-first summary for fast verification
Answer: The **Stage's detail screen** and the **Executor's log files**.
The correct answer is **C: The Stage's detail screen and the Executor's log files.**

* **Stage's detail screen:** On a specific stage's page in the Spark UI, the shuffle metrics include two columns: *Shuffle spill (memory)* and *Shuffle spill (disk)*. Non-zero values in the disk column confirm that data was serialized and written to disk during that stage.
* **Executor's log files:** From the **Executors** tab, open an executor's stdout/stderr logs and search for entries from `UnsafeExternalSorter`. Messages such as "Spilling data because number of spilledRecords crossed the threshold" are definitive evidence of spilling at the task level.

**Why the other options are incorrect:**

* **Query (SQL) detail screen:** This view shows high-level physical and logical plan metrics but does not expose the granular, task-level shuffle spill counters found in the Stage view.
* **Driver's log files:** Spilling is an executor-side event. The driver does not manage the executors' local shuffle memory, so spill events are not recorded in the driver logs.
* **Executor's detail screen:** The Executors tab provides aggregate metrics, but the per-task spill counters are most effectively analyzed on the Stage detail page.
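To make the log-file check concrete, here is a minimal Python sketch that scans executor log text for spill messages. The sample log lines below are invented for illustration (real logs are reached via the Spark UI's Executors tab or the worker's log directory); only the `UnsafeExternalSorter` message texts mirror what Spark actually emits.

```python
import re

# Invented sample of executor stderr output; the two UnsafeExternalSorter
# lines mimic the real messages Spark logs when a task spills to disk.
executor_log = """\
24/05/01 12:00:01 INFO Executor: Running task 3.0 in stage 7.0 (TID 42)
24/05/01 12:00:05 INFO UnsafeExternalSorter: Thread 61 spilling sort data of 512.0 MiB to disk (2 times so far)
24/05/01 12:00:09 INFO UnsafeExternalSorter: Spilling data because number of spilledRecords crossed the threshold 4096
24/05/01 12:00:12 INFO Executor: Finished task 3.0 in stage 7.0 (TID 42)
"""

# Lines mentioning "spill" from a sorter class indicate a disk spill.
spill_pattern = re.compile(r"(UnsafeExternalSorter|ExternalSorter).*[Ss]pill")

spill_events = [line for line in executor_log.splitlines()
                if spill_pattern.search(line)]

for event in spill_events:
    print(event)

print(f"Detected {len(spill_events)} spill event(s)")
```

The same pattern can be pointed at a real executor log file; any match is the task-level confirmation of spilling that the Stage screen's *Shuffle spill (disk)* column summarizes in aggregate.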
Author: LeetQuiz Editorial Team
During the execution of wide transformations in Apache Spark, data may 'spill' to disk if the executor's memory is insufficient. Which two locations in the Spark UI provide the primary indicators that partitions are spilling to disk?
A
The Driver's log files and the Executor's log files.
B
The Stage's detail screen and the Query's (SQL) detail screen.
C
The Stage's detail screen and the Executor's log files.
D
The Executor's detail screen and the Executor's log files.