
Answer-first summary for fast verification
Answer: Create BLOOM FILTER index on the transactionId
**Bloom filters** are designed for high-cardinality columns, that is, columns with many unique values. They let the engine quickly rule out files that cannot contain the requested value, which reduces scan time.
**Z-Ordering** organizes data on disk to colocate related information, improving query performance when filtering on those columns. However, Z-Ordering tends to work best for columns with low to medium cardinality. For a high-cardinality column like `transactionId` (auto-incrementing), Z-Ordering is usually not very effective, because the values are nearly unique and won’t cluster well.
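To make the file-skipping idea concrete, here is a minimal Python sketch of a Bloom filter kept per data file. It is a toy illustration, not how Delta implements its index: the file names, the ID layout, the bit-array size, and the hash scheme are all assumptions chosen for readability.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, value):
        # Derive num_hashes bit positions by salting a SHA-256 hash.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def build_file_filters(files):
    """Map each (hypothetical) file name to a filter over its transactionId values."""
    filters = {}
    for name, ids in files.items():
        bf = BloomFilter()
        for transaction_id in ids:
            bf.add(transaction_id)
        filters[name] = bf
    return filters


def files_to_scan(filters, transaction_id):
    """Keep only the files whose filter says the ID might be present."""
    return [name for name, bf in filters.items() if bf.might_contain(transaction_id)]


# Hypothetical layout: each file holds a handful of transactionId values.
files = {
    "part-0.parquet": [1, 4, 7, 10],
    "part-1.parquet": [2, 5, 8, 11],
    "part-2.parquet": [3, 6, 9, 12],
}
filters = build_file_filters(files)
candidates = files_to_scan(filters, 8)
print(candidates)  # always includes part-1.parquet; other files are skipped unless a false positive occurs
```

A lookup only opens the files returned by `files_to_scan`, so the cost of a point query depends on the number of candidate files rather than the size of the table. On Databricks the index itself is declared in SQL, along the lines of `CREATE BLOOMFILTER INDEX ON TABLE <table> FOR COLUMNS(transactionId)`, where the table name is yours to supply.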
Author: LeetQuiz Editorial Team
After resolving the small-files problem with `OPTIMIZE`, your queries are still slow. The column `transactionId`, which is used for filtering, has high cardinality and is auto-incrementing. Which Delta optimization can you enable to filter data on this column effectively?
A. Increase the driver size and enable delta optimization
B. Perform OPTIMIZE with ZORDER on transactionId
C. Create BLOOM FILTER index on the transactionId
D. Increase the cluster size and enable delta optimization
E. transactionId has high cardinality, so you cannot enable any optimization.