
Answer-first summary for fast verification
Answer: `SELECT cart_id, explode(items) AS item_id FROM raw_table;`
## Explanation The correct answer is **D** because the `EXPLODE` function is specifically designed to unnest array columns in Spark SQL. ### Why EXPLODE is the correct choice: - **EXPLODE** takes an array column and creates a new row for each element in the array - For a row with `cart_id = "cart1"` and `items = ["itemA", "itemB", "itemC"]`, `EXPLODE(items)` would generate: - `("cart1", "itemA")` - `("cart1", "itemB")` - `("cart1", "itemC")` - This perfectly transforms the nested array structure into the desired flat table format ### Why other options are incorrect: - **A. FILTER**: Used to filter elements from an array based on a condition, not to unnest arrays - **B. FLATTEN**: Used to flatten nested arrays (arrays of arrays), not to explode single arrays into rows - **C. REDUCE**: Used for aggregating array elements using a function, not for unnesting - **E. SLICE**: Used to extract a subset of elements from an array, not to create new rows ### Key Concept: `EXPLODE` is the standard Spark SQL function for converting array elements into individual rows, which is exactly what's needed for this data transformation task.
Author: LeetQuiz .
Ultimate access to all questions.
No comments yet.
Question 20
A junior data engineer has ingested a JSON file into a table raw_table with the following schema:
cart_id STRING,
items ARRAY<item_id:STRING>
cart_id STRING,
items ARRAY<item_id:STRING>
The junior data engineer would like to unnest the items column in raw_table to result in a new table with the following schema:
cart_id STRING,
item_id STRING
cart_id STRING,
item_id STRING
Which of the following commands should the junior data engineer run to complete this task?
A
SELECT cart_id, filter(items) AS item_id FROM raw_table;
B
SELECT cart_id, flatten(items) AS item_id FROM raw_table;
C
SELECT cart_id, reduce(items) AS item_id FROM raw_table;
D
SELECT cart_id, explode(items) AS item_id FROM raw_table;
E
SELECT cart_id, slice(items) AS item_id FROM raw_table;