
Answer-first summary for fast verification
Answer: Partition by timestamp, as it is likely that queries will often filter by time ranges.
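Partitioning by timestamp is usually the most effective strategy when time-based queries are common, as they typically are for user-activity data (for example, "all activity in the last 7 days"). The query engine can then prune partitions that fall outside the requested time range instead of scanning the whole dataset. In practice the partition key is usually a coarser value derived from the timestamp, such as the date, because partitioning on raw timestamps would create a very large number of tiny partitions.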
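The pruning idea can be illustrated with a minimal pure-Python sketch. It is not tied to any particular storage engine: the function names `partition_by_day` and `query_time_range` are hypothetical, the sample rows are invented for illustration, and daily granularity is an assumption rather than something the question specifies.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def partition_by_day(rows):
    """Group rows into partitions keyed by the calendar date of their timestamp."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row["timestamp"].date()].append(row)
    return partitions


def query_time_range(partitions, start, end):
    """Return rows with start <= timestamp < end and the number of partitions read."""
    matches = []
    partitions_read = 0
    # The end bound is exclusive, so the last day to visit is the day
    # containing the instant just before `end`.
    last_day = (end - timedelta(microseconds=1)).date()
    day = start.date()
    while day <= last_day:
        if day in partitions:
            partitions_read += 1
            matches.extend(
                r for r in partitions[day] if start <= r["timestamp"] < end
            )
        day += timedelta(days=1)
    return matches, partitions_read


# Invented sample data using the columns named in the question.
rows = [
    {"user_id": 1, "activity_type": "post", "timestamp": datetime(2024, 1, 1, 9, 0), "location": "US"},
    {"user_id": 2, "activity_type": "like", "timestamp": datetime(2024, 1, 2, 14, 30), "location": "DE"},
    {"user_id": 1, "activity_type": "comment", "timestamp": datetime(2024, 1, 2, 18, 0), "location": "US"},
    {"user_id": 3, "activity_type": "post", "timestamp": datetime(2024, 1, 3, 8, 15), "location": "JP"},
]

partitions = partition_by_day(rows)
matches, partitions_read = query_time_range(
    partitions, datetime(2024, 1, 2), datetime(2024, 1, 3)
)
print(len(matches), "rows from", partitions_read, "of", len(partitions), "partitions")
```

Here a query for a single day touches only one of the three partitions. A real engine applies the same principle at the file or directory level, skipping data whose partition value cannot satisfy the time filter.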
Author: LeetQuiz Editorial Team
You have a dataset of user activities on a social media platform with the columns user_id, activity_type, timestamp, and location. Explain how you would choose a partitioning strategy for this dataset to optimize query performance. Consider the frequency of queries and the nature of the data.
A
Partition by user_id, as it is likely that queries will often filter by individual users.
B
Partition by activity_type, as it is likely that queries will often filter by the type of activity.
C
Partition by timestamp, as it is likely that queries will often filter by time ranges.
D
Partition by location, as it is likely that queries will often filter by geographical data.