Databricks Certified Data Engineer - Associate

A data analyst has created a Delta table named 'sales' that is used by the entire data analysis team. To ensure data quality, the analyst asks the data engineering team to implement a series of validation tests. The data engineering team prefers to write these tests in Python rather than SQL. Which of the following commands could the data engineering team use to access the 'sales' Delta table in PySpark?

A. SELECT * FROM sales

B. There is no way to share data between PySpark and SQL.

C. spark.sql("sales")

D. spark.delta.table("sales")

E. spark.table("sales")

Explanation:

The correct answer is E, spark.table("sales"). The spark.table() method returns a table registered in the catalog, including a Delta table, as a DataFrame, so the data engineering team can read 'sales' and run its validation tests in PySpark. Option A (SELECT * FROM sales) is a SQL statement, not a Python expression; to run it from PySpark it would have to be passed as a string to spark.sql(). Option B is incorrect because SQL and PySpark read the same catalog tables, so data is shared between them. Option C (spark.sql("sales")) fails because the string "sales" on its own is not a complete SQL query. Option D (spark.delta.table("sales")) fails because the SparkSession object has no delta attribute.
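As a rough illustration of the answer, the sketch below reads the table with spark.table() and runs one simple validation test. It assumes a SparkSession named `spark` is already available (as it is in a Databricks notebook). The helper names `read_sales` and `count_null_rows` and the column `order_id` are hypothetical and not part of the question.

```python
def read_sales(spark, table_name="sales"):
    """Return the catalog table as a DataFrame via spark.table()."""
    return spark.table(table_name)


def count_null_rows(df, column):
    """Count rows where `column` is NULL.

    DataFrame.filter() accepts a SQL expression string as its condition.
    """
    return df.filter(f"{column} IS NULL").count()


# Usage in a notebook where `spark` is predefined.
# `order_id` is a hypothetical column chosen only for illustration:
#
#   sales_df = read_sales(spark)
#   assert count_null_rows(sales_df, "order_id") == 0
```

Passing a full query string, as in spark.sql("SELECT * FROM sales"), returns a DataFrame over the same table, which is why option A only works once it is wrapped in spark.sql().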
