
Answer-first summary for fast verification
Answer: Partition by station_id, z-order by timestamp, apply bloom filters on temperature.
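Partitioning by station_id lets queries that filter on a specific station read only that station's files, which works well when the number of stations is moderate and each partition holds a substantial amount of data. Z-ordering by timestamp co-locates readings that are close in time within the same files, so time-range filters can skip more files, which benefits time-series analysis. Bloom filters on temperature speed up lookups for specific temperature values; they help equality and IN predicates, not range filters.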
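As a rough illustration of option A, the sketch below builds the three SQL statements that would apply these optimizations. It rests on several assumptions: the table name `weather_readings`, the column types, and the `fpp` and `numItems` values are all hypothetical, and the `CREATE BLOOMFILTER INDEX` statement is Databricks SQL syntax that may not be available in every Delta Lake environment. The bloom filter index is created before the data is loaded because it is generally applied to files written after the index exists.

```python
def delta_optimization_statements(table: str = "weather_readings") -> list[str]:
    """Return the SQL statements for option A, in the order they would be run.

    The table name, column types, and bloom filter options are illustrative
    assumptions, not values taken from the question.
    """
    # 1. Partition by station_id so per-station queries prune other partitions.
    create_table = (
        f"CREATE TABLE IF NOT EXISTS {table} ("
        "station_id STRING, temperature DOUBLE, "
        "humidity DOUBLE, `timestamp` TIMESTAMP"
        ") USING DELTA PARTITIONED BY (station_id)"
    )
    # 2. Bloom filter index on temperature (Databricks SQL syntax; assumed
    #    fpp and numItems). Helps equality / IN lookups, not range filters.
    bloom_filter = (
        f"CREATE BLOOMFILTER INDEX ON TABLE {table} "
        "FOR COLUMNS (temperature OPTIONS (fpp = 0.1, numItems = 1000000))"
    )
    # 3. After loading data, compact files and cluster them by timestamp.
    zorder = f"OPTIMIZE {table} ZORDER BY (`timestamp`)"
    return [create_table, bloom_filter, zorder]
```

In a Spark session with Delta Lake configured, each returned statement could be passed to `spark.sql(...)`, with the data load happening between the bloom filter statement and the `OPTIMIZE` statement.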
Author: LeetQuiz Editorial Team
Given a dataset of weather readings from various stations, with columns such as station_id, temperature, humidity, and timestamp, describe how you would apply Delta Lake optimizations such as partitioning, z-ordering, and bloom filters to enhance query performance. Consider the typical query patterns and the size of the dataset.
A
Partition by station_id, z-order by timestamp, apply bloom filters on temperature.
B
Partition by temperature, z-order by station_id, apply bloom filters on humidity.
C
Partition by timestamp, z-order by temperature, apply bloom filters on station_id.
D
Partition by humidity, z-order by timestamp, apply bloom filters on temperature.