
Answer-first summary for fast verification
Answer: Partition by store_id, z-order by product_id, apply bloom filters on timestamp.
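Partitioning by store_id lets queries that filter on a specific store read only that store's partition, and a retail chain usually has few enough stores that the partitions stay reasonably large. Z-ordering by product_id co-locates rows for the same product within each partition's files, so data skipping can prune files during product-specific analysis. Bloom filters on timestamp speed up lookups for specific timestamp values; they answer equality checks only, so time-range filters rely on file-level min/max statistics rather than on the Bloom filter.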
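As a rough illustration of the chosen option, the sketch below assembles the SQL statements one might run. The table name `sales`, the column types, and the Bloom filter options (`fpp`, `numItems` values) are assumptions made for this example, not part of the question. `CREATE BLOOMFILTER INDEX` is Databricks-specific syntax, and the helper `build_delta_statements` is hypothetical; it only builds strings and does not connect to a cluster.

```python
from typing import List


def build_delta_statements(table: str = "sales") -> List[str]:
    """Return SQL statements sketching option B (table name is an assumption)."""
    # Partition by store_id so store-specific queries read a single partition.
    # Column types are assumed; the question only names the columns.
    create_table = (
        f"CREATE TABLE IF NOT EXISTS {table} ("
        "product_id STRING, store_id STRING, "
        "sales_amount DOUBLE, `timestamp` TIMESTAMP) "
        "USING DELTA PARTITIONED BY (store_id)"
    )
    # Databricks-specific: Bloom filter index for equality lookups on timestamp.
    # The fpp and numItems values are placeholders to tune per dataset.
    bloom_index = (
        f"CREATE BLOOMFILTER INDEX ON TABLE {table} "
        "FOR COLUMNS(`timestamp` OPTIONS (fpp=0.1, numItems=50000000))"
    )
    # Z-order by product_id to co-locate each product's rows within files.
    # Typically run after data has been loaded.
    zorder = f"OPTIMIZE {table} ZORDER BY (product_id)"
    return [create_table, bloom_index, zorder]


if __name__ == "__main__":
    for statement in build_delta_statements():
        print(statement + ";")
```

On a Databricks cluster, each string could be passed to `spark.sql(...)`. The index is listed before the load and the `OPTIMIZE` step because, as far as we understand, a Bloom filter index only covers data written after it is created.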
Author: LeetQuiz Editorial Team
Given a dataset of product sales from a retail chain that includes columns such as product_id, store_id, sales_amount, and timestamp, describe how you would apply Delta Lake optimizations such as partitioning, z-ordering, and bloom filters to enhance query performance. Consider the typical query patterns and the size of the dataset.
A. Partition by product_id, z-order by timestamp, apply bloom filters on sales_amount.
B. Partition by store_id, z-order by product_id, apply bloom filters on timestamp.
C. Partition by timestamp, z-order by sales_amount, apply bloom filters on store_id.
D. Partition by sales_amount, z-order by timestamp, apply bloom filters on product_id.