Optimizing Query Performance in Delta Lake

For a Databricks job that processes terabytes of data daily, which data partitioning strategy would best enhance query performance on the processed data stored in Delta Lake, given diverse query patterns?
A. Employing a custom partitioner that dynamically adapts partitions according to query workload and access patterns
B. Partitioning data by date, under the assumption that most queries filter based on a date range
C. Not partitioning the data and depending solely on Delta Lake's optimization features, such as Z-ordering, to boost query performance
D. Adopting a range partitioning strategy centered around a column that's frequently queried to ensure uniform data distribution
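
For context, the strategies in options B and C might look like the following Delta Lake SQL sketch. The table and column names (`events`, `event_date`, `event_id`) are illustrative assumptions, not part of the question:

```sql
-- Option B: partition the Delta table by a date column so queries that
-- filter on a date range can skip (prune) entire partitions.
CREATE TABLE events (
  event_id   STRING,
  event_date DATE,
  payload    STRING
)
USING DELTA
PARTITIONED BY (event_date);

-- Option C: alternatively, leave the table unpartitioned and rely on
-- Delta Lake's data clustering, e.g. Z-ordering on a frequently
-- filtered (non-partition) column, to co-locate related records.
OPTIMIZE events ZORDER BY (event_id);
```

Note that `OPTIMIZE ... ZORDER BY` cannot target a partition column; Z-ordering complements partitioning by clustering data within files on other frequently filtered columns.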