
Answer-first summary for fast verification
Answer: Because `persist()` is lazily evaluated, an action must be executed on the DataFrame before the cached DataFrame appears in the Spark UI.
Both `cache()` and `persist()` are lazy in Spark: calling them only marks the DataFrame for caching. An action such as `df.count()` or `df.show()` must run to actually materialize the cache and make the DataFrame visible in the Spark UI's Storage tab. The DataFrame's absence from the UI immediately after `persist()` therefore reflects lazy evaluation, not a failure of the command.
Author: LeetQuiz Editorial Team
A data engineering team is optimizing a complex pipeline handling trillions of rows per table. They choose to persist some frequently used DataFrames to speed up query processing. After executing the persist() command on a DataFrame, a data engineer checks the Spark UI's Storage Tab but finds no information about the persisted DataFrame. What could be the reason?
A
DataFrames persisted via persist() do not appear in the Storage Tab; only those persisted with cache() are visible in Spark UI's Storage Tab.
B
The details of the persisted DataFrame are exclusively available in Ganglia metrics.
C
The DataFrame details should appear in the Spark UI's Storage Tab right after the persist() command. Absence indicates the command failed to execute.
D
Because persist() is lazily evaluated, executing an action on the DataFrame is required to see the cached DataFrame in Spark UI.
E
persist() stores the DataFrame in memory, making it invisible in Spark UI.