Databricks Certified Data Engineer - Professional

Ultimate access to all questions.

Consider a dataset of financial transactions that includes columns like transaction_id, customer_id, amount, and timestamp. Describe how you would choose the appropriate partitioning strategy for this dataset to optimize query performance, considering the typical query patterns and the size of the dataset.

Simulated

Partition by transaction_id, as it is likely that queries will often filter by specific transactions.

10.6%

Partition by customer_id, as it is likely that queries will often filter by individual customers.

25.6%

Loading comments...