
A data analyst has created a Delta table sales that is used by the entire data analysis team. They want help from the data engineering team to implement a series of tests to ensure the data is clean. However, the data engineering team uses Python for its tests rather than SQL.
Which command could the data engineering team use to access sales in PySpark?
A
SELECT * FROM sales
B
spark.table("sales")
C
spark.sql("sales")
D
spark.delta.table("sales")
Explanation:
The correct answer is B. spark.table("sales").
Here's why:
spark.table("sales") is the standard PySpark method to access a registered table in the Spark session. This method returns a DataFrame representing the table, which can then be used for data validation, testing, and analysis in Python.
Option A (SELECT * FROM sales) is incorrect because it is SQL syntax, not Python/PySpark syntax. The query would work if wrapped as spark.sql("SELECT * FROM sales"), but the option shows only the raw SQL without that PySpark wrapper.
Option C (spark.sql("sales")) is incorrect because spark.sql() expects a complete SQL statement, not just a table name. Spark would attempt to parse "sales" as SQL and raise a ParseException.
Option D (spark.delta.table("sales")) is incorrect because spark.delta.table() is not a real PySpark API; no such method exists on the SparkSession. The spark.table() method works for both Delta and non-Delta tables, making it the standard, recommended approach.
Key Points:
spark.table("table_name") is the standard PySpark API for accessing registered tables
spark.sql("...") runs a complete SQL statement and also returns a DataFrame
Both methods work the same way for Delta and non-Delta tables