
Answer-first summary for fast verification
Answer: Migrate data to an external storage system that supports TTL natively, such as Azure Blob Storage with lifecycle management policies, and then delete the original data from Delta Lake.
Delta Lake does not have a native delta.ttl property for automatic data expiration. To implement TTL-like behavior, you must either schedule periodic DELETE operations followed by VACUUM in Databricks, or use external storage lifecycle management (e.g., Azure Blob Storage or ADLS Gen2 policies) to automatically delete files after a set duration. Among the given options, C is the most accurate, as it leverages Azure’s native lifecycle policies for automated data removal without custom code or complex maintenance.
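As a sketch of the DELETE-plus-VACUUM alternative mentioned above, a scheduled Databricks job could generate and run statements like the following. The table name `customer_events`, the `event_date` column, and the 90-day retention window are assumptions for illustration, not part of the question:

```python
# Sketch of a retention job: build the DELETE and VACUUM statements that a
# scheduled Databricks job would execute via spark.sql(...). Table name,
# column name, and retention window are illustrative assumptions.
def build_retention_sql(table: str, retention_days: int) -> list[str]:
    # DELETE removes rows older than the retention window (a logical delete);
    # VACUUM then physically removes data files no longer referenced.
    return [
        f"DELETE FROM {table} "
        f"WHERE event_date < current_date() - INTERVAL {retention_days} DAYS",
        f"VACUUM {table} RETAIN 168 HOURS",  # keep 7 days of history for time travel
    ]

statements = build_retention_sql("customer_events", 90)
for stmt in statements:
    print(stmt)
    # In a real job: spark.sql(stmt)
```

Note that this approach still requires scheduling and monitoring the job, which is exactly the maintenance burden option C avoids.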
Author: LeetQuiz Editorial Team
In the context of data modeling using Delta Lake on Microsoft Azure, a data engineer is tasked with implementing a solution to manage the lifecycle of sensitive customer data efficiently. The solution must ensure data is automatically expired after a specified duration to comply with data retention policies and minimize storage costs. Which of the following approaches BEST meets these requirements? Choose one option.
A
Implementing a custom application logic that periodically scans and deletes data older than the specified duration, which requires additional development and maintenance efforts.
B
Using the ALTER TABLE ... SET TBLPROPERTIES statement with the delta.ttl property to set a Time To Live (TTL) on the Delta Lake table, enabling automatic expiration of data after the specified duration.
C
Migrating data to an external storage system that supports TTL natively, such as Azure Blob Storage with lifecycle management policies, and then deleting the original data from Delta Lake.
D
Creating a scheduled job that uses Azure Data Factory to copy data to a new Delta Lake table every time the data is about to expire, and then deleting the old table.
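To make option C concrete, an Azure Blob Storage lifecycle management policy can delete blobs a fixed number of days after modification. Below is a minimal sketch of such a policy built as a Python dict; the rule name, container prefix, and 365-day window are assumptions chosen for illustration:

```python
import json

# Sketch of an Azure Blob Storage lifecycle management policy that deletes
# base blobs a fixed number of days after last modification. Rule name,
# prefix, and retention period are illustrative assumptions.
def expire_after_policy(prefix: str, days: int) -> dict:
    return {
        "rules": [
            {
                "enabled": True,
                "name": "expire-customer-data",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    "filters": {
                        "blobTypes": ["blockBlob"],
                        "prefixMatch": [prefix],
                    },
                    "actions": {
                        "baseBlob": {
                            "delete": {"daysAfterModificationGreaterThan": days}
                        }
                    },
                },
            }
        ]
    }

policy = expire_after_policy("customer-data/", 365)
print(json.dumps(policy, indent=2))
```

A policy like this can be attached to the storage account (for example via `az storage account management-policy create` or an ARM template), after which Azure expires matching blobs automatically with no custom code to maintain.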