Databricks Certified Data Engineer - Professional

Explanation:

Spilling occurs when Spark is forced to move data from memory to disk during a shuffle or sort. You can identify this in two main places:

Stage Detail Screen: In the Stages tab, select a specific Stage ID to open its detail view, which includes the Shuffle metrics section. If any task in the stage spilled, two additional columns appear:
- Shuffle spill (memory): The size of the data in memory before it was spilled.
- Shuffle spill (disk): The size of the data once serialized and written to disk.
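
The same two counters can also be checked programmatically. The sketch below assumes the stage records were fetched from Spark's monitoring REST API (the `/api/v1/applications/[app-id]/stages` endpoint) and already parsed from JSON, and that each record carries the `memoryBytesSpilled` and `diskBytesSpilled` fields; verify the field names against your Spark version. The helper name `stages_with_spill` and the sample records are hypothetical and used only for illustration.

```python
from typing import Any, Dict, List


def stages_with_spill(stages: List[Dict[str, Any]]) -> List[Dict[str, int]]:
    """Return a short summary for every stage that reports spilled bytes."""
    spilled = []
    for stage in stages:
        # Missing fields are treated as zero (no spill reported).
        mem = int(stage.get("memoryBytesSpilled", 0))
        disk = int(stage.get("diskBytesSpilled", 0))
        if mem > 0 or disk > 0:
            spilled.append(
                {
                    "stageId": int(stage["stageId"]),
                    "memoryBytesSpilled": mem,
                    "diskBytesSpilled": disk,
                }
            )
    return spilled


# Hypothetical sample records, shaped like parsed REST API output.
SAMPLE_STAGES = [
    {"stageId": 6, "memoryBytesSpilled": 0, "diskBytesSpilled": 0},
    {"stageId": 7, "memoryBytesSpilled": 2147483648, "diskBytesSpilled": 268435456},
]

if __name__ == "__main__":
    for entry in stages_with_spill(SAMPLE_STAGES):
        print(entry)
```

A stage whose in-memory figure is much larger than its on-disk figure is expected, since the on-disk data is serialized.
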
Executor Log Files: From the Executors tab, open an executor's stdout or stderr log to find explicit log entries from the UnsafeExternalSorter (e.g., Spilling data because...). These entries confirm at the task level that memory limits were exceeded.
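
As a rough illustration of checking executor logs, the sketch below filters downloaded log text for lines that name the sorter class and mention spilling. The function name `find_spill_lines` is hypothetical, and the sample log lines are invented for the example; the exact wording of real spill messages varies between Spark versions, which is why the match is a loose, case-insensitive search for "spill".

```python
import re
from typing import List

# Loose match: real message wording differs across Spark versions.
SPILL_PATTERN = re.compile(r"spill", re.IGNORECASE)


def find_spill_lines(log_text: str, source: str = "UnsafeExternalSorter") -> List[str]:
    """Return log lines that mention `source` and contain the word 'spill'."""
    return [
        line
        for line in log_text.splitlines()
        if source in line and SPILL_PATTERN.search(line)
    ]


# Illustrative log excerpt (made up for this example, not real output).
SAMPLE_LOG = """\
24/01/01 10:00:00 INFO Executor: Running task 3.0 in stage 7.0 (TID 42)
24/01/01 10:00:05 INFO UnsafeExternalSorter: Thread 61 spilling sort data to disk
24/01/01 10:00:09 INFO Executor: Finished task 3.0 in stage 7.0 (TID 42)
"""

if __name__ == "__main__":
    for line in find_spill_lines(SAMPLE_LOG):
        print(line)
```
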

The remaining locations do not serve this purpose:

Query Detail Screen: The SQL tab is useful for viewing the physical and logical plans, but it does not provide the low-level shuffle spill counters found in the Stage detail view.
Driver Log Files: Spilling is performed by Executors, so the Driver logs do not record these executor-level memory management events.
Executor Summary Screen: The Executors tab shows aggregate metrics, but it lacks the granular, per-stage shuffle spill metrics needed for detailed debugging.

Last updated: January 6, 2026 at 15:42

- The Driver’s log files and the Executor’s log files. (8.3%)
- The Stage’s detail screen and the SQL Query detail screen. (16.7%)
- The Stage’s detail screen and the Executor’s log files. (62.5%)
- The Executor’s detail screen and the Executor’s log files. (12.5%)