
Answer-first summary for fast verification
Answer: Using the VACUUM command with a retention period that optimizes storage while preserving history
The correct answer is **D. Using the VACUUM command with a retention period that optimizes storage while preserving history**. VACUUM removes data files that are no longer referenced by the table's transaction log and are older than the retention threshold (7 days by default), reclaiming storage and keeping the table healthy without discarding the history needed for time travel within that window.

- **Option A** is not advisable: disabling the retention duration check (`spark.databricks.delta.retentionDurationCheck.enabled = false`) lets VACUUM run with a very short retention period, which can delete files still needed by concurrent readers or by time-travel queries, causing unintended data loss.
- **Option B** improves query performance by co-locating related data, but `OPTIMIZE ... ZORDER BY` does not remove old files, so it does not address storage reclamation or log compaction.
- **Option C**, manually deleting older files, is inefficient and error-prone, and can corrupt the table, especially when there is a large volume of transaction activity.

In summary, the VACUUM command, used with an appropriate retention period, offers the best balance between preserving necessary historical data and optimizing storage and performance.
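As a minimal sketch of the recommended approach (the table name `events` and the 30-day window are illustrative assumptions, not from the question):

```sql
-- Preview which files would be removed before committing to the deletion.
VACUUM events RETAIN 720 HOURS DRY RUN;

-- Keep 30 days (720 hours) of history for time travel, then reclaim storage.
-- The retention duration check stays enabled (the default), so VACUUM will
-- refuse a retention window shorter than the table's configured threshold.
VACUUM events RETAIN 720 HOURS;
```

Running the `DRY RUN` first is a common safeguard: it lists the candidate files without deleting anything, so the retention window can be verified before storage is actually reclaimed.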
Author: LeetQuiz Editorial Team
What is the most effective strategy for managing log compaction in a Delta Lake table with extensive transaction logs without losing historical data?
A. Disabling the retention duration check by setting `spark.databricks.delta.retentionDurationCheck.enabled` to false
B. Organizing log files using the OPTIMIZE command with ZORDER
C. Manually deleting older log files that exceed the retention period
D. Using the VACUUM command with a retention period that optimizes storage while preserving history