Databricks Certified Data Engineer - Associate

Explanation:

The correct answer is A. spark.sql.

spark.sql() is the SparkSession method for running a SQL query against tables and views (including temporary views registered from DataFrames); it returns the result as a DataFrame.
The query string is an f-string (F"SELECT ..."), which supports Python variable interpolation: {table_name} is replaced with the value of the Python variable table_name.
Calling spark.sql("SELECT ... FROM table_name") is the standard way to run SQL queries in PySpark.

B. spark.read: This returns a DataFrameReader for loading data from external sources (CSV, Parquet, etc.); it does not execute SQL queries.
C. spark.execute: This method doesn't exist in PySpark's SparkSession API.
D. spark.query: This method doesn't exist in PySpark's SparkSession API.
E. spark.table: This method returns a DataFrame for a given table name but doesn't execute arbitrary SQL queries.
F. spark.run: This method doesn't exist in PySpark's SparkSession API.
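
The difference between options A and E can be sketched as follows. This is a minimal illustration, assuming an active SparkSession is passed in as `spark`; the helper names `select_with_sql` and `select_with_table` are hypothetical and exist only for this example.

```python
from typing import Any


def select_with_sql(spark: Any, table_name: str):
    """Run a SQL statement built with an f-string; spark.sql returns a DataFrame."""
    return spark.sql(f"SELECT customer_id, spend FROM {table_name}")


def select_with_table(spark: Any, table_name: str):
    """Load the named table as a DataFrame, then pick columns with the DataFrame API.

    spark.table takes only a table name, so it cannot run a SELECT statement.
    """
    return spark.table(table_name).select("customer_id", "spend")
```

Both helpers should yield the same two columns, but only spark.sql accepts a full SQL string, which is what the question asks for.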

# Assuming table_name is a Python variable
table_name = "sales_data"
result = spark.sql(F"SELECT customer_id, spend FROM {table_name}")

Note: The original code shows (table_name) in parentheses, but in an f-string, it should be {table_name}. The correct f-string syntax would be:

spark.sql(F"SELECT customer_id, spend FROM {table_name}")
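
The effect of parentheses versus curly braces can be checked in plain Python, without Spark. The sketch below reuses the illustrative table name "sales_data" from the example above and only builds the query strings.

```python
table_name = "sales_data"  # illustrative table name, not from the exam question

# Parentheses are literal characters inside an f-string: nothing is substituted.
with_parens = F"SELECT customer_id, spend FROM (table_name)"

# Curly braces mark a replacement field: the variable's value is substituted.
with_braces = F"SELECT customer_id, spend FROM {table_name}"

print(with_parens)  # SELECT customer_id, spend FROM (table_name)
print(with_braces)  # SELECT customer_id, spend FROM sales_data
```

The F prefix may be written in upper or lower case; both produce the same string.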

Community

KKeng

Last updated: January 13, 2026 at 09:03

spark.sql: 81.5%
spark.read: 5.4%
spark.execute: 2.2%
spark.query: 3.3%
spark.table: 4.3%