
Ultimate access to all questions.
In the context of data modeling using Delta Lake on Microsoft Azure, a data engineer is tasked with implementing a solution to manage the lifecycle of sensitive customer data efficiently. The solution must ensure data is automatically expired after a specified duration to comply with data retention policies and minimize storage costs. Which of the following approaches BEST meets these requirements? Choose one option.
A
Implementing a custom application logic that periodically scans and deletes data older than the specified duration, which requires additional development and maintenance efforts.
B
Using the ALTER TABLE ... SET TBLPROPERTIES statement with the delta.ttl property to set a Time To Live (TTL) on the Delta Lake table, enabling automatic expiration of data after the specified duration.
C
Migrate data to an external storage system that supports TTL natively, such as Azure Blob Storage with lifecycle management policies, and then deleting the original data from Delta Lake.
D
Creating a scheduled job that uses Azure Data Factory to copy data to a new Delta Lake table every time the data is about to expire, and then deleting the old table.