
Explanation:
The EXPLODE function in Databricks SQL is a table-valued generator function that returns rows by un-nesting an array or map expression. It is used to transform each element of an array into a separate row, which is essential for counting the number of items in each order. The correct code block uses EXPLODE(items) to achieve this, making it the right choice for the task.
Example usage of EXPLODE:
SELECT explode(array(10, 20)) AS elem, 'Spark'; 10 Spark 20 Spark
This demonstrates how EXPLODE works by breaking down the array into individual rows.
Ultimate access to all questions.
No comments yet.
A data engineer is tasked with processing data from an e-commerce application. They have two tables: orders and customers. The schema and sample data are as follows:
orders: order_id bigint, order_customer_id bigint, items array<int>
customers: customer_id bigint, customer_name string, customer_email string
The goal is to retrieve customer details along with the number of items in each of their orders. Which of the following code blocks achieves this?
A
SELECT c.customer_id, c.customer_name, c.customer_email, o.order_id, COUNT(o.item_id) as total_items FROM customers c INNER JOIN (SELECT order_id, order_customer_id, PIVOT(items) as item_id FROM orders) o ON c.customer_id = o.order_customer_id GROUP BY c.customer_id, c.customer_name, c.customer_email, o.order_id;
B
SELECT c.customer_id, c.customer_name, c.customer_email, o.order_id, COUNT(o.item_id) as total_items FROM customers c INNER JOIN (SELECT order_id, order_customer_id, FLATTEN(items) as item_id FROM orders) o ON c.customer_id = o.order_customer_id GROUP BY c.customer_id, c.customer_name, c.customer_email, o.order_id;
C
SELECT c.customer_id, c.customer_name, c.customer_email, o.order_id, COUNT(o.item_id) as total_items FROM customers c INNER JOIN (SELECT order_id, order_customer_id, EXPLODE(items) as item_id FROM orders) o ON c.customer_id = o.order_customer_id GROUP BY c.customer_id, c.customer_name, c.customer_email, o.order_id;
D
SELECT c.customer_id, c.customer_name, c.customer_email, o.order_id, COUNT(o.item_id) as total_items FROM customers c INNER JOIN (SELECT order_id, order_customer_id, COLLECT_SET(items) as item_id FROM orders) o ON c.customer_id = o.order_customer_id GROUP BY c.customer_id, c.customer_name, c.customer_email, o.order_id;
E
SELECT c.customer_id, c.customer_name, c.customer_email, o.order_id, COUNT(o.item_id) as total_items FROM customers c INNER JOIN (SELECT order_id, order_customer_id, ARRAY_DISTINCT(items) as item_id FROM orders) o ON c.customer_id = o.order_customer_id GROUP BY c.customer_id, c.customer_name, c.customer_email, o.order_id;