
Databricks Certified Data Engineer - Associate
A data engineer is tasked with processing data from an e-commerce website, stored in the events table. The table captures various customer actions such as register, search, browse, add_to_cart, and more. The goal is to create a view that displays a unique collection of actions and items in the cart for each user. Which query achieves this?
Explanation:
The correct query uses collect_set to gather unique event names for each user, flatten to merge arrays of item IDs into a single array, and array_distinct to remove duplicates from the merged array. This approach efficiently compiles a unique history of actions and cart items per user. The items.item_id syntax correctly accesses the item_id subfield within the items array of structs.
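
The answer options are not reproduced here, so the following is only a sketch of what a query matching the explanation could look like, together with a plain-Python emulation of the three functions to show what each step contributes. The column names user_id and event_name, the view name carts, and the sample rows are assumptions made for illustration. The Python helpers mimic the behavior of the Spark SQL functions of the same name; they are not Spark APIs.

```python
from collections import defaultdict

# Hypothetical Spark SQL query consistent with the explanation above.
# Assumed names (not given in the source): user_id, event_name, carts.
QUERY = """
CREATE OR REPLACE TEMP VIEW carts AS
SELECT user_id,
       collect_set(event_name) AS event_history,
       array_distinct(flatten(collect_set(items.item_id))) AS cart_history
FROM events
GROUP BY user_id
"""


def collect_set(values):
    """Emulate collect_set: unique values within a group.

    Uses a list scan so it also works for unhashable values (lists).
    Spark gives no ordering guarantee; this version keeps first-seen order.
    """
    unique = []
    for value in values:
        if value not in unique:
            unique.append(value)
    return unique


def flatten(arrays):
    """Emulate flatten: merge an array of arrays into one array."""
    return [element for array in arrays for element in array]


def array_distinct(values):
    """Emulate array_distinct: drop duplicate elements from an array."""
    return list(dict.fromkeys(values))


# Made-up sample rows: (user_id, event_name, items).
# Each items value is a list of dicts standing in for an array of structs.
events = [
    ("u1", "search", []),
    ("u1", "add_to_cart", [{"item_id": "A"}, {"item_id": "B"}]),
    ("u1", "add_to_cart", [{"item_id": "B"}, {"item_id": "C"}]),
    ("u2", "register", []),
    ("u2", "browse", []),
]


def build_carts(rows):
    """Group rows by user and apply the same steps as the query."""
    groups = defaultdict(list)
    for user_id, event_name, items in rows:
        groups[user_id].append((event_name, items))

    carts = {}
    for user_id, group in groups.items():
        # items.item_id on an array of structs yields an array of item_id values.
        item_id_arrays = collect_set(
            [item["item_id"] for item in items] for _, items in group
        )
        carts[user_id] = {
            "event_history": collect_set(name for name, _ in group),
            "cart_history": array_distinct(flatten(item_id_arrays)),
        }
    return carts


carts = build_carts(events)
```

In this sample, user u1 added item B to the cart in two separate events. collect_set alone would still leave B inside two different arrays, which is why the flattened result is passed through array_distinct.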