
A data engineer has a Python variable table_name that they would like to use in a SQL query. They want to construct a Python code block that will run the query using table_name. They have the following incomplete code block:
____(f"SELECT customer_id, spend FROM {table_name}")
____(f"SELECT customer_id, spend FROM {table_name}")
What can be used to fill in the blank to successfully complete the task?
A. spark.delta.sql
B. spark.sql
C. spark.table
D. dbutils.sql
Explanation:
The correct answer is B. spark.sql.
spark.sql() is the standard PySpark method for executing SQL queries in Databricks. spark.sql(f"SELECT customer_id, spend FROM {table_name}") runs the query with the table name substituted through the f-string.

The other options are incorrect:

A. spark.delta.sql: This method does not exist in PySpark. Delta Lake operations are performed through spark.sql() or the DeltaTable API.
C. spark.table: This method creates a DataFrame from a table name (e.g., spark.table(table_name)); it does not execute SQL queries.
D. dbutils.sql: dbutils is the Databricks utilities module, but it has no sql method. SQL execution is handled through spark.sql().
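For illustration, here is a minimal sketch of the completed code contrasting spark.sql with spark.table. The table name "sales" and its columns are hypothetical; in a Databricks notebook the spark session is predefined, while the getOrCreate() call covers running it elsewhere.

from pyspark.sql import SparkSession

# In a Databricks notebook `spark` already exists; outside one, create it explicitly.
spark = SparkSession.builder.getOrCreate()

table_name = "sales"  # hypothetical table name

# B. spark.sql executes the SQL text; the f-string substitutes the variable.
df = spark.sql(f"SELECT customer_id, spend FROM {table_name}")
df.show()

# C. By contrast, spark.table returns a DataFrame for the table without
# executing arbitrary SQL; column selection happens via DataFrame methods.
df2 = spark.table(table_name).select("customer_id", "spend")
df2.show()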