
Answer-first summary for fast verification
Answer: Because the default data retention threshold is 7 days, data files containing deleted records will be retained until the VACUUM job is run 8 days later.
The concern is about Delta Lake's default file retention. With default settings, data files that a DELETE removes from the table stay in storage for 7 days (table property delta.deletedFileRetentionDuration) so that time travel keeps working, and VACUUM only removes files older than that threshold. The deletion job runs on Sunday at 1am and finishes by 2am. The first VACUUM runs on Monday at 3am, just over a day later. At that point the files holding the deleted records are only about a day old, so that run leaves them in place. They become eligible for removal 7 days after the delete, early the following Sunday. The next scheduled VACUUM, on the following Monday and about 8 days after the deletion, permanently removes them. Until then the deleted records remain reachable through time travel, so the exposure window is about 8 days, not 24 hours.
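The timeline arithmetic above can be checked with a short sketch. It is a simplified model, not Delta Lake code: the calendar dates are hypothetical (2024-01-07 happens to be a Sunday), the function name first_effective_vacuum is made up for illustration, and file eligibility is approximated as "delete completion time plus the 7-day retention", whereas the real VACUUM works from file and log timestamps.

```python
from datetime import datetime, timedelta

# Default retention for removed data files (delta.deletedFileRetentionDuration).
RETENTION = timedelta(days=7)

# Hypothetical dates for illustration: 2024-01-07 is a Sunday.
delete_done = datetime(2024, 1, 7, 2, 0)   # job starts at 1am, finishes by 2am
first_vacuum = datetime(2024, 1, 8, 3, 0)  # Monday 3am, repeats weekly


def first_effective_vacuum(delete_done, first_vacuum,
                           retention=RETENTION, every=timedelta(weeks=1)):
    """Return the first scheduled VACUUM run that can purge the old files.

    Assumption: files become eligible exactly `retention` after the delete
    completes, and VACUUM runs every `every` starting at `first_vacuum`.
    """
    eligible_at = delete_done + retention
    run = first_vacuum
    while run < eligible_at:
        run += every  # this run is too early; the files are kept
    return run


purge_time = first_effective_vacuum(delete_done, first_vacuum)
exposure = purge_time - delete_done

print(purge_time)  # 2024-01-15 03:00:00 (the following Monday)
print(exposure)    # 8 days, 1:00:00
```

Under these assumptions the Monday run one day after the delete is skipped, and the files are purged by the run on the following Monday, which matches option C.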
Author: LeetQuiz Editorial Team
A data engineering team has implemented a weekly batch job to process customer data deletion requests ("right to be forgotten") at 1am every Sunday, completing within one hour. All user data is stored in Delta Lake tables with default settings. A separate VACUUM operation runs on all Delta Lake tables every Monday at 3am.
The compliance officer raises concerns about Delta Lake's time travel feature potentially allowing access to deleted data.
Assuming proper deletion logic implementation, which statement accurately resolves this compliance concern?
A. Because the VACUUM command permanently deletes all files containing deleted records, deleted records may be accessible with time travel for around 24 hours.
B. Because the default data retention threshold is 24 hours, data files containing deleted records will be retained until the VACUUM job is run the following day.
C. Because the default data retention threshold is 7 days, data files containing deleted records will be retained until the VACUUM job is run 8 days later.
D. Because Delta Lake's delete statements have ACID guarantees, deleted records will be permanently purged from all storage systems as soon as a delete job completes.