
Answer-first summary for fast verification
Answer: Implement tombstone markers to logically mark records as deleted without removing them from the dataset, thus retaining all data changes for compliance while maintaining dataset performance.
Tombstone markers in Delta Lake mark data as logically deleted: the deletion is recorded, but the data is not physically removed from the dataset at that point. This suits scenarios that must comply with data retention policies, because the complete history of data changes stays available for audit. The dataset also stays performant, because readers skip the tombstoned data and the overhead of physical deletion is avoided. Option A is incorrect because physically deleting records destroys the history that the compliance requirement mandates. Option C is incorrect because it eventually removes the data, which conflicts with a mandate to retain all changes. Option D is incorrect because archiving the deleted records separately complicates the archiving process without a clear benefit over option B.
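The idea of a logical delete can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not Delta Lake's API: the class `ToyTable` and its methods `insert`, `delete`, `live`, and `as_of` are hypothetical names, and the model tracks individual records, whereas Delta Lake records tombstones as remove actions against data files in its transaction log.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Toy model of logical deletion. Hypothetical; not the Delta Lake API."""
    rows: dict = field(default_factory=dict)        # record id -> payload, never physically removed
    tombstones: dict = field(default_factory=dict)  # record id -> version that marked it deleted
    log: list = field(default_factory=list)         # append-only audit trail: (version, action, id)

    def _commit(self, action, record_id):
        # Each change gets the next version number and is appended to the log.
        version = len(self.log)
        self.log.append((version, action, record_id))
        return version

    def insert(self, record_id, payload):
        self.rows[record_id] = payload
        self._commit("add", record_id)

    def delete(self, record_id):
        # Logical delete: write a tombstone, keep the payload in storage.
        self.tombstones[record_id] = self._commit("remove", record_id)

    def live(self):
        # Current view: readers skip tombstoned records.
        return {k: v for k, v in self.rows.items() if k not in self.tombstones}

    def as_of(self, version):
        # Audit view: replay the log up to and including `version`.
        state = {}
        for v, action, record_id in self.log:
            if v > version:
                break
            if action == "add":
                state[record_id] = self.rows[record_id]
            else:
                state.pop(record_id, None)
        return state


table = ToyTable()
table.insert("a", 1)   # version 0
table.insert("b", 2)   # version 1
table.delete("a")      # version 2: tombstone only

current = table.live()     # "a" is hidden from readers
audit = table.as_of(1)     # "a" is still recoverable for audit
```

In this sketch the current view stays small because deleted records are filtered out, while the log and the retained payloads keep every change available for audit, which is the trade-off option B describes.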
Author: LeetQuiz Editorial Team
You are a data engineer working on a project that requires managing a large dataset in Delta Lake. The project has strict compliance requirements that mandate the retention of all data changes for audit purposes, while also needing to ensure the dataset remains performant for analytics. Your team decides to implement tombstone markers for efficient data archiving. Considering the scenario, which of the following best describes the use of tombstone markers in Delta Lake to meet both compliance and performance requirements? Choose the best option from the four provided.
A
Implement tombstone markers to physically delete records from the dataset, ensuring the dataset's performance is optimized by reducing its size.
B
Implement tombstone markers to logically mark records as deleted without removing them from the dataset, thus retaining all data changes for compliance while maintaining dataset performance.
C
Implement tombstone markers to logically mark records as deleted and then physically remove them from the dataset after a certain period, balancing compliance and performance.
D
Implement tombstone markers to logically mark records as deleted and then archive only the deleted records separately, keeping the main dataset performant.