
Answer-first summary for fast verification
Answer: activity_date
When selecting columns for partitioning, it's beneficial to consider columns that will have records continuously added over time, such as datetime columns. Partitioning by `activity_date` optimizes the table by allowing efficient archiving of older data and improves query performance for time-based queries.
Author: LeetQuiz Editorial Team
Ultimate access to all questions.
A data engineer is tasked with creating a Delta Lake table to store user activities from a website. The schema includes: user_id LONG, page STRING, activity_type LONG, ip_address STRING, activity_time TIMESTAMP, and activity_date DATE. Which column would be the most suitable for partitioning the Delta Table?
A
user_id
B
activity_type
C
page
D
activity_time
E
activity_date
No comments yet.