
Answer-first summary for fast verification
Answer: Implement a hybrid partitioning scheme that partitions recent data daily and historical data monthly to optimize for query access patterns.
The most efficient structure for this scenario is a hybrid partitioning scheme: partition recent data (up to one month old) by day and historical data by month. Daily partitions keep recent data in small chunks, so a query filtered to a specific day or week can prune every other partition and scan only the data it needs. Monthly partitions keep the partition count for the decade of history manageable (about 120 monthly partitions rather than roughly 3,650 daily ones), which limits metadata overhead and small files while still letting queries over longer periods skip the months outside their range. The hybrid scheme therefore matches partition granularity to how each range of data is actually queried.
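To make the idea concrete, here is a minimal sketch of how a hybrid scheme might assign partition keys and how many partitions a date-range query would touch. It is not tied to any particular lakehouse engine: the names `partition_key`, `partitions_for_range`, and `RECENT_WINDOW_DAYS`, the 30-day window, and the `day=`/`month=` key format are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Assumption: "recent" means the last 30 days relative to `today`.
RECENT_WINDOW_DAYS = 30


def partition_key(event_date: date, today: date) -> str:
    """Return a daily key for recent data and a monthly key for older data."""
    cutoff = today - timedelta(days=RECENT_WINDOW_DAYS)
    if event_date >= cutoff:
        return event_date.strftime("day=%Y-%m-%d")
    return event_date.strftime("month=%Y-%m")


def partitions_for_range(start: date, end: date, today: date) -> list[str]:
    """List the distinct partition keys a query over [start, end] must read."""
    keys: list[str] = []
    current = start
    while current <= end:
        key = partition_key(current, today)
        # Dates are visited in order, so equal keys are always adjacent.
        if not keys or keys[-1] != key:
            keys.append(key)
        current += timedelta(days=1)
    return keys


if __name__ == "__main__":
    today = date(2024, 6, 30)
    # A one-week query on recent data touches seven small daily partitions.
    print(partitions_for_range(date(2024, 6, 24), date(2024, 6, 30), today))
    # A full-year historical query touches twelve monthly partitions.
    print(len(partitions_for_range(date(2023, 1, 1), date(2023, 12, 31), today)))
```

In practice a scheme like this usually also needs a periodic job that rewrites daily partitions into their monthly partition once they age out of the recent window; that step is omitted here.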
Author: LeetQuiz Editorial Team
You are optimizing a lakehouse for complex time-based queries against a decade of historical data, with a focus on ensuring optimal performance for queries on recent data (up to one month old). How should you structure your partitions?
A. Use a flat partitioning scheme based on ingestion time, relying on the lakehouse's automatic optimization features to handle query performance.
B. Create separate tables for historical and recent data, with recent data partitioned by day and historical data by year, and use a view to unify them for querying.
C. Partition data by year, then by month, applying Z-ordering on the most queried columns within the recent data partition.
D. Implement a hybrid partitioning scheme that partitions recent data daily and historical data monthly to optimize for query access patterns.