
Answer-first summary for fast verification
Answer: Command 1 succeeds, but Command 2 fails because `countries_af` is a Python variable containing a list of strings, which is not accessible within the SQL REPL.
In Databricks notebooks, the language REPLs (Python, SQL, Scala, R) do not share local variables directly.
1. **Command 1** is a valid Python statement. The `.collect()` method returns a list of Row objects to the driver, and the list comprehension extracts the country names into a standard Python list named `countries_af`.
2. **Command 2** is a SQL cell. The SQL engine has no visibility into variables defined in the Python REPL; it expects `countries_af` to be a table, view, or common table expression (CTE) that it can resolve. Because the Python list was never registered as a temporary view (e.g., using `spark.createDataFrame(...).createOrReplaceTempView("countries_af")`), the SQL engine cannot resolve the identifier, and Command 2 fails.
Command 1 still succeeds because it is syntactically correct and runs entirely within the Python environment.
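As a rough illustration of how the gap between the two REPLs could be bridged, the sketch below registers the Python list as a temporary view and then builds `sales_af` from it. The helper names `country_rows` and `create_sales_af` and the constant `SALES_AF_SQL` are hypothetical, not part of the original question. The sketch also assumes two changes to Command 2: `sales_af` becomes a temporary view (a permanent view cannot reference a temporary view), and the `IN` predicate uses a parenthesized subquery. The table and column names are taken from the question as written.

```python
def country_rows(countries):
    """Wrap each country name in a 1-tuple so each name becomes a one-column row."""
    return [(name,) for name in countries]


# Hypothetical rewrite of Command 2 (an assumption, not the original query):
# a TEMP view that reads the list through a subquery on the registered view.
SALES_AF_SQL = """
CREATE OR REPLACE TEMP VIEW sales_af AS
SELECT * FROM sales
WHERE city IN (SELECT country FROM countries_af) AND continent = 'AF'
"""


def create_sales_af(spark, countries):
    """Register the Python list as a temp view, then create sales_af from it."""
    # An explicit DDL schema string avoids schema inference on an empty list.
    lookup = spark.createDataFrame(country_rows(countries), schema="country string")
    lookup.createOrReplaceTempView("countries_af")
    spark.sql(SALES_AF_SQL)


# In a Databricks Python cell, where `spark` is predefined:
# create_sales_af(spark, countries_af)
```

Once the list is registered as a view, the SQL engine can resolve `countries_af` by name, which is exactly what the original Command 2 was missing.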
Author: LeetQuiz Editorial Team
A data engineer is working in a Databricks notebook and attempts to create a view named sales_af using data from African countries. The notebook utilizes language interoperability across cells:
Command 1 (Python):
countries_af = [x[0] for x in spark.table("geo_lookup").filter("continent='AF'").select("country").collect()]
Command 2 (SQL):
CREATE VIEW sales_af AS
SELECT * FROM sales
WHERE city IN countries_af AND CONTINENT = "AF"
Assuming only the geo_lookup and sales tables exist before execution, what will occur when these commands are executed sequentially?
A. Both commands will execute successfully, resulting in the creation of both the countries_af and sales_af views.
B. Both commands fail; no new variables, tables, or views are created in the notebook session.
C. Command 1 succeeds. Command 2 searches all accessible databases for a countries_af table and succeeds only if a match is found externally.
D. Command 1 succeeds, but Command 2 fails because countries_af is a Python variable containing a list of strings, which is not accessible within the SQL REPL.
E. Command 1 succeeds, but Command 2 fails because countries_af is a Python variable holding a PySpark DataFrame object.