
Answer-first summary for fast verification
Answer: C. Apply a data retention policy that uses the timestamp column to identify and manage records older than a specified retention period for archiving or deletion.
Option C is the best choice. When compliance and storage cost are the priorities, a retention policy for a Delta Lake table should be keyed on a time-based column such as the record-creation timestamp, because that column directly measures the age of each record. Filtering on it lets you systematically identify records that have exceeded the retention period and archive or delete them, so that only data still inside the retention window is kept. Option A is incorrect because random selection neither guarantees that expired records are removed nor protects records that are still within the retention period. Option B is flawed because unique identifiers are not guaranteed to correlate with the age of the data. Option D is inappropriate because it is driven by access frequency rather than data age, so it does not address the requirement to archive or delete old data.
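As a rough illustration of the timestamp-based approach, the sketch below computes a retention cutoff and builds the SQL statements that would archive and then delete expired rows. The table name `events`, the archive table `events_archive`, the column `created_at`, and the helper names are hypothetical and chosen only for this example; the question does not specify them. The statements use standard Spark SQL `INSERT INTO ... SELECT` and Delta Lake `DELETE FROM` syntax.

```python
from datetime import datetime, timedelta, timezone


def retention_cutoff(now: datetime, retention_days: int) -> datetime:
    """Return the timestamp before which records count as expired."""
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    return now - timedelta(days=retention_days)


def build_retention_sql(table: str, ts_column: str, archive_table: str,
                        now: datetime, retention_days: int) -> list[str]:
    """Build archive-then-delete statements for rows older than the cutoff.

    All identifiers passed in are illustrative; they are assumed to be
    trusted names, not user input.
    """
    cutoff = retention_cutoff(now, retention_days).strftime("%Y-%m-%d %H:%M:%S")
    predicate = f"{ts_column} < TIMESTAMP '{cutoff}'"
    return [
        # Copy expired rows to the archive table first ...
        f"INSERT INTO {archive_table} SELECT * FROM {table} WHERE {predicate}",
        # ... then remove them from the live table.
        f"DELETE FROM {table} WHERE {predicate}",
    ]


if __name__ == "__main__":
    # Hypothetical names and a 30-day retention period, for illustration only.
    statements = build_retention_sql(
        table="events",
        ts_column="created_at",
        archive_table="events_archive",
        now=datetime(2024, 1, 31, tzinfo=timezone.utc),
        retention_days=30,
    )
    for stmt in statements:
        print(stmt)
        # In a Spark session with Delta Lake available, each statement
        # could be run with: spark.sql(stmt)
```

Note that a Delta Lake `DELETE` removes rows logically; the underlying data files are only physically removed, and storage actually reclaimed, when a later `VACUUM` runs on the table after the files have aged past its retention threshold.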
Author: LeetQuiz Editorial Team
You are a data engineer working with a large dataset in Delta Lake on Microsoft Azure. Your organization requires efficient archiving or deletion of old data to comply with data retention policies and optimize storage costs. The dataset includes a timestamp column recording when each record was created. Which of the following strategies should you implement to achieve this goal? Choose the best option and explain why it is the most suitable. (Choose one option)
A
Implement a data retention policy that randomly selects records for deletion based on a non-time-based column to ensure fairness in data removal.
B
Use a data retention policy that targets records based on a unique identifier column, assuming that older records have lower unique IDs.
C
Apply a data retention policy that uses the timestamp column to identify and manage records older than a specified retention period for archiving or deletion.
D
Create a data retention policy that focuses on the most frequently accessed records, regardless of their age, to improve query performance.