
Answer-first summary for fast verification
Answer: The VACUUM command was run on the table
## Explanation

The correct answer is **A. The VACUUM command was run on the table**.

### Why VACUUM causes this issue:

1. **VACUUM removes old data files**: The `VACUUM` command in Delta Lake physically deletes data files that are no longer referenced by the Delta table and are older than the retention threshold (7 days by default).
2. **Time travel dependency**: Delta time travel relies on the underlying data files still being present in storage. To read or restore an older version, Delta Lake must access the data files that made up that version.
3. **Retention period**: By default, `VACUUM` keeps unreferenced data files for 7 days (the transaction log is retained separately, for 30 days by default). A 3-day-old version would therefore survive a `VACUUM` run with default settings. In this scenario, `VACUUM` must have been run with a retention period shorter than 3 days, overriding the 7-day default, which physically deleted the data files needed for the 3-day-old version.

### Why other options are incorrect:

- **B. The TIME TRAVEL command was run on the table**: There is no `TIME TRAVEL` command in Delta Lake. Time travel is performed using SQL syntax like `SELECT * FROM table_name VERSION AS OF version_number` or `RESTORE TABLE table_name TO VERSION AS OF version_number`.
- **C. The DELETE HISTORY command was run on the table**: There is no `DELETE HISTORY` command in Delta Lake. The closest command is `VACUUM`, which removes old data files.
- **D. The OPTIMIZE command was run on the table**: The `OPTIMIZE` command compacts small files into larger ones for better performance, but it does not delete the old data files needed for time travel. It writes new, compacted files and leaves the old ones in place until they are vacuumed.
- **E. The HISTORY command was run on the table**: The `HISTORY` command only displays the transaction history of a Delta table; it does not delete any data files.

### Best Practice Recommendation:

To prevent this issue when you rely on time travel:

1. Set the `VACUUM` retention period to be longer than the furthest point you need to time travel to.
2. Use `VACUUM table_name RETAIN 168 HOURS` (7 days), or a longer period if you need to time travel further back.
3. Be cautious about running `VACUUM` with short retention periods in production environments where time travel might be needed.
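The retention arithmetic behind this answer can be sketched in a few lines of Python. This is a toy model only, not Delta Lake's implementation: the function name `vacuum_would_delete`, the constant `DEFAULT_RETENTION_HOURS`, and the timestamps are hypothetical. It also assumes the only input that matters is how long ago a file stopped being referenced by the table.

```python
from datetime import datetime, timedelta

# Assumption: 168 hours (7 days) is the default retention cited in the explanation.
DEFAULT_RETENTION_HOURS = 168


def vacuum_would_delete(unreferenced_since, now, retention_hours=DEFAULT_RETENTION_HOURS):
    """Toy rule: a file that is no longer referenced by the table is removed
    once it has been unreferenced for longer than the retention window."""
    return now - unreferenced_since > timedelta(hours=retention_hours)


# Hypothetical scenario: the 3-day-old version's files were replaced by a
# daily update 2 days (48 hours) ago.
now = datetime(2024, 1, 10, 12, 0)
unreferenced_since = now - timedelta(days=2)

# Default retention: 48 hours is inside the 168-hour window, so the files stay.
print(vacuum_would_delete(unreferenced_since, now))                      # False
# Retention overridden to 24 hours: the files are removed and time travel fails.
print(vacuum_would_delete(unreferenced_since, now, retention_hours=24))  # True
```

Under this model, a `VACUUM` with the default retention leaves the 3-day-old version restorable, which is why the scenario in the question implies a shorter, overridden retention period.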
Author: Keng Suppaseth
A data engineer has realized that they made a mistake when making a daily update to a table. They need to use Delta time travel to restore the table to a version that is 3 days old. However, when the data engineer attempts to time travel to the older version, they are unable to restore the data because the data files have been deleted. Which of the following explains why the data files are no longer present?
A. The VACUUM command was run on the table
B. The TIME TRAVEL command was run on the table
C. The DELETE HISTORY command was run on the table
D. The OPTIMIZE command was run on the table
E. The HISTORY command was run on the table