
A junior data engineer is experimenting with language interoperability in Databricks notebooks. The goal is to create a view showing all sales from the African countries listed in the geo_lookup table. The current database contains only two tables: geo_lookup and sales.
The following code is executed:
Cmd 1:
%python
countries_af = [x[0] for x in
    spark.table("geo_lookup").filter("continent='AF'").select("country").collect()]
Cmd 2:
%sql
CREATE VIEW sales_af AS
SELECT *
FROM sales
WHERE country IN (countries_af)
AND continent = 'AF'
What will be the result of executing these command cells sequentially in an interactive notebook?
A. Both commands will succeed. Executing SHOW TABLES will show that countries_af and sales_af have been registered as views.
B. Cmd 1 will succeed. Cmd 2 will search all accessible databases for a table or view named countries_af; if this entity exists, Cmd 2 will succeed.
C. Cmd 1 will succeed and Cmd 2 will fail. countries_af will be a Python variable representing a PySpark DataFrame.
D. Cmd 1 will succeed and Cmd 2 will fail. countries_af will be a Python variable containing a list of strings.