Databricks Certified Data Engineer - Associate



A data engineer is tasked with processing data from an e-commerce website, stored in the events table. The table captures various customer actions such as register, search, browse, add_to_cart, and more. The goal is to create a view that displays a unique collection of actions and items in the cart for each user. Which query achieves this?





Explanation:

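The correct query aggregates per user and uses collect_set to gather the unique event names for each user. For the cart items, the items.item_id syntax accesses the item_id subfield of every struct in the items array, so it yields one array of item IDs per row. collect_set gathers those arrays, flatten merges the resulting array of arrays into a single array, and array_distinct removes the duplicate IDs that remain after merging, as in array_distinct(flatten(collect_set(items.item_id))). The result is one row per user holding a deduplicated history of actions and cart items.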
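To make the data flow concrete, here is a minimal plain-Python sketch that emulates what the three functions do for each user. It is not Spark code: the sample rows, the column names user_id and event_name, and the helper summarize are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows: (user_id, event_name, items),
# where items is a list of structs, modeled here as dicts.
events = [
    ("u1", "search", []),
    ("u1", "add_to_cart", [{"item_id": "A"}, {"item_id": "B"}]),
    ("u1", "add_to_cart", [{"item_id": "B"}, {"item_id": "C"}]),
    ("u2", "register", []),
]


def summarize(rows):
    """Emulate the per-user aggregation described in the explanation."""
    event_names = defaultdict(set)   # plays the role of collect_set(event_name)
    item_arrays = defaultdict(set)   # plays the role of collect_set(items.item_id)
    for user_id, event_name, items in rows:
        event_names[user_id].add(event_name)
        # items.item_id: pull item_id out of every struct in the array
        item_arrays[user_id].add(tuple(item["item_id"] for item in items))

    result = {}
    for user_id in event_names:
        # flatten: merge the array of arrays into one array
        flat = [item_id for arr in item_arrays[user_id] for item_id in arr]
        # array_distinct: drop repeated IDs, keeping first occurrences
        distinct = list(dict.fromkeys(flat))
        result[user_id] = (event_names[user_id], distinct)
    return result


summary = summarize(events)
```

In this sample, collect_set alone is not enough for the cart items: the arrays ("A", "B") and ("B", "C") are different arrays, so both are kept, and "B" appears twice after flattening. That is why array_distinct is applied last. As with collect_set in Spark, the order of the collected elements should not be relied on.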