
Databricks Certified Data Engineer - Professional
A data engineering team has implemented a weekly batch job to process customer data deletion requests ("right to be forgotten") at 1am every Sunday, completing within one hour. All user data is stored in Delta Lake tables with default settings. A separate weekly VACUUM operation runs at 3am every Monday across all Delta Lake tables.
Given that Delta Lake's time travel capability could potentially allow access to deleted data, and assuming proper deletion logic implementation, which statement accurately resolves the compliance officer's concern?
Explanation:
By default, Delta Lake keeps data files that have been removed from a table for 7 days (delta.deletedFileRetentionDuration) before VACUUM is allowed to physically delete them, and until then earlier table versions remain readable through time travel. In this scenario, the deletion job finishes by 2am on Sunday and VACUUM runs at 3am every Monday. The VACUUM that runs roughly one day after the deletion leaves the affected files in place because they are still younger than the 7-day threshold. The files become eligible for removal the following Sunday, and the next VACUUM after that point runs on the Monday 8 days after the deletion. Deleted records therefore remain accessible via time travel for about 8 days, not 7, until that VACUUM removes the files. Options A, B, C, and D are incorrect because they misstate retention periods, ACID guarantees, or admin access. Option E correctly aligns with the default 7-day retention and the weekly VACUUM schedule.
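The timeline in this explanation can be checked with a little date arithmetic. The sketch below is illustrative only: the calendar dates are arbitrary (2024-01-07 happens to be a Sunday), the helper name first_effective_vacuum is hypothetical, and the model assumes VACUUM removes a file only once it has been out of the table for longer than the 7-day default retention. It does not call Delta Lake itself.

```python
from datetime import datetime, timedelta

# Default delta.deletedFileRetentionDuration, as stated in the explanation.
RETENTION = timedelta(days=7)

# Assumed example dates: the Sunday deletion job starts at 1am and is done by 2am.
delete_done = datetime(2024, 1, 7, 2, 0)    # Sunday 2am
first_vacuum = datetime(2024, 1, 8, 3, 0)   # Monday 3am, the first VACUUM after the delete


def first_effective_vacuum(removed_at, vacuum_at,
                           retention=RETENTION, period=timedelta(weeks=1)):
    """Return the first scheduled VACUUM run that can remove files deleted at removed_at.

    Hypothetical helper: steps through the weekly schedule until a run falls
    at or after the moment the files pass the retention threshold.
    """
    eligible_at = removed_at + retention
    run = vacuum_at
    while run < eligible_at:
        run += period
    return run


purged_at = first_effective_vacuum(delete_done, first_vacuum)
exposure = purged_at - delete_done

print("Files physically removed at:", purged_at)
print("Time travel exposure window:", exposure)
```

Under these assumptions the first Monday VACUUM is skipped for the affected files, and they are removed by the run on the following Monday, a little over 8 days after the deletion job finished.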