
A data engineer has a Python variable table_name that they would like to use in a SQL query. They want to construct a Python code block that will run the query using table_name. They have the following incomplete code block:

____(f"SELECT customer_id, spend FROM {table_name}")
Which of the following can be used to fill in the blank to successfully complete the task?
A
spark.delta.sql
B
spark.delta.table
C
spark.table
D
dbutils.sql
E
spark.sql
Explanation:
The correct answer is E (spark.sql).

spark.sql() is the primary method in PySpark for executing SQL queries from Python in Databricks. Because the query is passed as a string, an f-string can interpolate the table_name variable directly:

spark.sql(f"SELECT customer_id, spend FROM {table_name}")

Why the other options are incorrect:
- spark.delta.sql and spark.delta.table do not exist in the PySpark API.
- spark.table accepts a table name and returns it as a DataFrame; it does not execute an arbitrary SQL query.
- dbutils does not have a .sql method; SQL execution from Python is done through spark.sql().

Key Concept: In Databricks, spark.sql() is the standard way to execute SQL queries from Python code, and it supports string interpolation using f-strings or other string formatting methods.
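To make the pattern concrete, here is a minimal sketch of building the query with an f-string; the table name "customers" is an assumed sample value, and the spark.sql call assumes an active SparkSession (e.g. the spark object provided in a Databricks notebook):

```python
table_name = "customers"  # assumed sample table name

# The f-string interpolates the variable into the SQL text
query = f"SELECT customer_id, spend FROM {table_name}"
print(query)  # SELECT customer_id, spend FROM customers

# In a Databricks notebook, where `spark` is a ready-made SparkSession:
# df = spark.sql(query)
# df.show()
```

Note that f-string interpolation builds the SQL text before spark.sql() ever sees it, so the same approach works with any string-formatting method (str.format, concatenation, etc.).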