
A data engineer has a Python variable table_name that they would like to use in a SQL query. They want to construct a Python code block that will run the query using table_name. They have the following incomplete code block:
____(f"SELECT customer_id, spend FROM {table_name}")
Which of the following can be used to fill in the blank to successfully complete the task?
A. spark.delta.sql
B. spark.delta.table
C. spark.table
D. dbutils.sql
E. spark.sql
Explanation:
The correct answer is spark.sql. In Databricks, the spark.sql() method is used to execute SQL queries from Python code. This method takes a SQL query string as an argument and returns a DataFrame. The other options are incorrect:
spark.delta.sql does not exist; a SparkSession has no delta attribute.
spark.delta.table does not exist either, for the same reason, so it cannot be used to run a query.
spark.table reads a table into a DataFrame by name; it does not execute arbitrary SQL queries.
dbutils.sql does not exist; dbutils has utilities but no SQL execution methods.

The code block would be completed as: spark.sql(f"SELECT customer_id, spend FROM {table_name}")
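
The completed call can be wrapped in a small helper, shown here as a minimal sketch rather than a definitive implementation. The function names build_query and load_customer_spend are hypothetical, as is the identifier check, which assumes table names are plain identifiers with at most a catalog and schema qualifier. The only Spark call used is spark.sql, which takes the query string and returns a DataFrame.

```python
import re


def build_query(table_name: str) -> str:
    """Return the SELECT statement from the question for a given table name."""
    # Assumption (not in the original question): table names are plain
    # identifiers, optionally qualified as catalog.schema.table. Anything
    # else is rejected before it is interpolated into the f-string.
    if not re.fullmatch(r"[A-Za-z_]\w*(\.[A-Za-z_]\w*){0,2}", table_name):
        raise ValueError(f"unexpected table name: {table_name!r}")
    return f"SELECT customer_id, spend FROM {table_name}"


def load_customer_spend(spark, table_name: str):
    """Run the query with spark.sql and return the resulting DataFrame."""
    # `spark` is expected to be a SparkSession, such as the one a
    # Databricks notebook predefines under that name.
    return spark.sql(build_query(table_name))
```

In a notebook this would be called as load_customer_spend(spark, table_name). Keeping the string construction in its own function means the query text can be checked without a running cluster.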