
Question 20
A junior data engineer has ingested a JSON file into a table raw_table with the following schema:
cart_id STRING,
items ARRAY<item_id:STRING>
The junior data engineer would like to unnest the items column in raw_table to result in a new table with the following schema:
cart_id STRING,
item_id STRING
Which of the following commands should the junior data engineer run to complete this task?
A
SELECT cart_id, filter(items) AS item_id FROM raw_table;
B
SELECT cart_id, flatten(items) AS item_id FROM raw_table;
C
SELECT cart_id, reduce(items) AS item_id FROM raw_table;
D
SELECT cart_id, explode(items) AS item_id FROM raw_table;
E
SELECT cart_id, slice(items) AS item_id FROM raw_table;
Explanation:
The correct answer is D. The EXPLODE function is specifically designed to unnest array columns in Spark SQL: it produces one output row per array element, pairing each element with the other columns from the original row.
For example, given a row with cart_id = "cart1" and items = ["itemA", "itemB", "itemC"], EXPLODE(items) would generate:
("cart1", "itemA")
("cart1", "itemB")
("cart1", "itemC")
EXPLODE is the standard Spark SQL function for converting array elements into individual rows, which is exactly what this transformation requires. None of the other options unnests an array: FILTER and SLICE return a new array per row, FLATTEN collapses an array of arrays into a single array, and REDUCE aggregates array elements into a single value, so each of them still yields one output row per input row.