
Question 23
A data engineer has ingested data from an external source into a PySpark DataFrame raw_df. They need to briefly make this data available in SQL for a data analyst to perform a quality assurance check on the data.
Which of the following commands should the data engineer run to make this data available in SQL for only the remainder of the Spark session?
Explanation:
The correct answer is A because:
createOrReplaceTempView("raw_df") creates a temporary view that is only available for the duration of the current Spark session, which matches the requirement to "briefly make this data available in SQL for only the remainder of the Spark session". The analyst can then query it with SELECT * FROM raw_df.

Why the other options are incorrect:

- createTable("raw_df"): this method does not exist in the PySpark DataFrame API.
- write.save("raw_df"): this saves the DataFrame to a file system location but does not make it available as a SQL table or view.
- saveAsTable("raw_df"): this creates a permanent table in the Hive metastore that persists beyond the current Spark session, which contradicts the requirement for temporary availability.

The key distinction is that temporary views (createOrReplaceTempView) are session-scoped, while tables created with saveAsTable are persistent and survive session restarts.