
A data engineer has realized that they made a mistake when making a daily update to a table. They need to use Delta time travel to restore the table to a version that is 3 days old. However, when the data engineer attempts to time travel to the older version, they are unable to restore the data because the data files have been deleted. Which of the following explains why the data files are no longer present?
A. The VACUUM command was run on the table
B. The TIME TRAVEL command was run on the table
C. The DELETE HISTORY command was run on the table
D. The OPTIMIZE command was run on the table
E. The HISTORY command was run on the table
Explanation:
The correct answer is A. The VACUUM command was run on the table.
VACUUM removes old data files: The VACUUM command in Delta Lake is used to physically delete data files that are no longer referenced by the Delta table and are older than the retention threshold (default is 7 days).
Time travel dependency: Delta time travel relies on the actual data files being available on storage. When you time travel to an older version, Delta Lake needs to access the data files that were part of that version.
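Retention period: By default, VACUUM only deletes unreferenced data files that are older than 7 days, while the transaction log itself is retained for 30 days. The files behind a 3-day-old version would therefore survive a default VACUUM. They can only be gone if VACUUM was run with a retention period shorter than 3 days, for example through an explicit RETAIN clause or a lowered delta.deletedFileRetentionDuration table property.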
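The interaction between the retention threshold and time travel can be sketched with a small toy model. This is not Delta Lake's implementation: the `ToyDeltaTable` class, its methods, and the file names are hypothetical, and the model assumes that a file's age is measured from the hour it was logically removed by a commit.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ToyDeltaTable:
    """Toy model (hypothetical): versions map to file sets, storage holds physical files."""
    versions: Dict[int, Set[str]] = field(default_factory=dict)
    storage: Set[str] = field(default_factory=set)
    removed_at: Dict[str, float] = field(default_factory=dict)  # file -> hour it was tombstoned

    def commit(self, files: Set[str], now_hours: float) -> int:
        """Record a new version that references exactly `files`."""
        version = len(self.versions)
        if version > 0:
            # Files dropped by this commit are only logically removed (tombstoned).
            for name in self.versions[version - 1] - files:
                self.removed_at[name] = now_hours
        for name in files:
            # A file referenced by the new version is live, so clear any old tombstone.
            self.removed_at.pop(name, None)
        self.versions[version] = set(files)
        self.storage |= files
        return version

    def vacuum(self, now_hours: float, retain_hours: float = 168) -> Set[str]:
        """Physically delete tombstoned files older than the retention threshold."""
        expired = {
            name for name, removed in self.removed_at.items()
            if name in self.storage and now_hours - removed > retain_hours
        }
        self.storage -= expired
        return expired

    def can_time_travel(self, version: int) -> bool:
        """A version is readable only if every file it references still exists."""
        return self.versions[version] <= self.storage


def demo(retain_hours: float) -> bool:
    """Replay the scenario from the question with a given VACUUM retention."""
    table = ToyDeltaTable()
    good = table.commit({"part-0.parquet"}, now_hours=0)
    table.commit({"part-1.parquet"}, now_hours=24)  # the mistaken daily update
    table.vacuum(now_hours=72, retain_hours=retain_hours)
    return table.can_time_travel(good)


print(demo(168))  # True: the default 7-day retention keeps the old file
print(demo(24))   # False: a 24-hour retention deleted the file the old version needs
```

In this sketch the old version stays in the version map either way. What changes is whether its files still exist, which mirrors why the restore in the question fails even though the version is listed in the table history.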
B. The TIME TRAVEL command was run on the table: There is no TIME TRAVEL command in Delta Lake. Time travel is performed using SQL syntax like SELECT * FROM table_name VERSION AS OF version_number or RESTORE TABLE table_name TO VERSION AS OF version_number.
C. The DELETE HISTORY command was run on the table: There is no DELETE HISTORY command in Delta Lake. The closest command is VACUUM, which removes old data files.
D. The OPTIMIZE command was run on the table: The OPTIMIZE command compacts small files into larger ones for better performance, but it doesn't delete old data files needed for time travel. It creates new optimized files while keeping the old ones until they're vacuumed.
E. The HISTORY command was run on the table: The actual command is DESCRIBE HISTORY, and it only displays the transaction history of a Delta table; it doesn't delete any data files.
To prevent this issue when you rely on time travel:
Set the VACUUM retention period appropriately (longer than your time travel needs).
Use VACUUM table_name RETAIN 168 HOURS (7 days), or a longer period if you need to time travel further back.
Avoid running VACUUM with short retention periods in production environments where time travel might be needed.
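One way to apply this advice is to derive the RETAIN value from the time travel window instead of hard-coding it. The helper below is a minimal sketch: `vacuum_statement` and `DEFAULT_RETENTION_HOURS` are hypothetical names, the function only builds the SQL text, and it never goes below the 7-day default mentioned above.

```python
DEFAULT_RETENTION_HOURS = 168  # 7 days, the default threshold cited in the explanation


def vacuum_statement(table_name: str, time_travel_days: int) -> str:
    """Build a VACUUM statement whose retention covers the time travel window.

    Hypothetical helper: it does not validate or quote `table_name`,
    so pass only trusted identifiers.
    """
    if time_travel_days < 0:
        raise ValueError("time_travel_days must be non-negative")
    # Keep at least the 7-day default, and more if a longer window is needed.
    retain_hours = max(DEFAULT_RETENTION_HOURS, time_travel_days * 24)
    return f"VACUUM {table_name} RETAIN {retain_hours} HOURS"


print(vacuum_statement("table_name", 3))   # VACUUM table_name RETAIN 168 HOURS
print(vacuum_statement("table_name", 30))  # VACUUM table_name RETAIN 720 HOURS
```

The resulting string would then be submitted through whatever SQL interface the environment provides.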