
Answer-first summary for fast verification
Answer: Delta Lake time travel is not efficient in terms of cost or latency for long-term historical versioning and auditing.
While Delta Lake’s **time travel** feature allows you to query prior table versions, it is intended for short-term retention (the defaults are 7 days for data files and 30 days for transaction logs). For **long-term auditing**, relying on time travel requires significantly increasing these retention settings, which leads to several issues:

1. **Increased Storage Costs**: Bloats storage by keeping every underlying Parquet file and log entry indefinitely.
2. **Performance Degradation**: As the transaction log grows, operations like `DESCRIBE HISTORY` or `RESTORE` become more expensive and slower.

In contrast, an **SCD Type 2** design explicitly models each change with 'effective' and 'expiry' timestamps. This allows for performant, cost-controlled access to historical rows through standard partitioning and indexing, without the unbounded growth of the transaction log.

**Note on other options:**

- Delta Lake transactions are atomic, so multi-field updates do not risk corruption.
- Delta Lake files are immutable; they are never modified in place.
- Shallow clones do not resolve the performance issues inherent in long-term log retention.
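To make the 'effective' and 'expiry' idea concrete, here is a minimal sketch of the SCD Type 2 pattern in plain Python rather than Spark or Delta Lake. The names `AddressVersion`, `apply_address_change`, and `address_as_of` are hypothetical and chosen only for illustration, as are the sample addresses and dates; in a real Delta table the same close-and-insert step is typically expressed as a `MERGE`.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class AddressVersion:
    """One SCD Type 2 row (hypothetical schema for illustration)."""
    customer_id: int
    street_address: str
    effective_ts: datetime
    expiry_ts: Optional[datetime] = None  # None marks the current row


def apply_address_change(
    history: List[AddressVersion],
    customer_id: int,
    new_address: str,
    change_ts: datetime,
) -> None:
    """Expire the customer's current row (if any) and append a new current row."""
    for row in history:
        if row.customer_id == customer_id and row.expiry_ts is None:
            if row.street_address == new_address:
                return  # unchanged address: nothing to record
            row.expiry_ts = change_ts  # close the old version
    history.append(AddressVersion(customer_id, new_address, change_ts))


def address_as_of(
    history: List[AddressVersion],
    customer_id: int,
    as_of: datetime,
) -> Optional[str]:
    """Return the address valid at `as_of`, or None if no row covers that time."""
    for row in history:
        if row.customer_id != customer_id:
            continue
        still_open = row.expiry_ts is None or as_of < row.expiry_ts
        if row.effective_ts <= as_of and still_open:
            return row.street_address
    return None


# Sample data (made up): one customer who moves once.
history: List[AddressVersion] = []
apply_address_change(history, 1, "12 Oak St", datetime(2020, 1, 1))
apply_address_change(history, 1, "98 Elm Ave", datetime(2023, 6, 15))

old_address = address_as_of(history, 1, datetime(2021, 5, 1))
current_address = address_as_of(history, 1, datetime(2024, 1, 1))
```

Because every historical version is an ordinary row, an audit query is a plain filter on the two timestamps and does not depend on how long old data files or transaction log entries are retained.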
Author: LeetQuiz Editorial Team
A data architect is evaluating two methods for auditing historical street addresses in a customers table. They are comparing the use of Delta Lake’s time travel on a Type 1 table for long-term auditing versus implementing a standard SCD Type 2 table.
Which of the following critical factors should influence the decision to choose a Type 2 table over relying solely on time travel?
A. SCD Type 2 tables require updating multiple fields in a single operation, which can lead to data corruption if a query fails during a partial update.
B. Delta Lake time travel is not efficient in terms of cost or latency for long-term historical versioning and auditing.
C. Delta Lake time travel cannot access previous versions of Type 1 tables because changes modify data files in place.
D. Combining shallow clones with Type 1 tables is the standard architectural pattern to speed up historical queries for long-term versioning.