
Ultimate access to all questions.
In the context of managing a Delta table with historical data, your organization requires implementing an automated data retention policy to delete data older than 2 years to comply with regulatory requirements. The solution must be cost-effective, scalable, and minimize manual intervention. Considering these constraints, which of the following approaches BEST leverages Delta Lake's features to achieve this goal? (Choose one option)
A
Manually delete the old data files from the underlying storage system, ensuring to audit the process for compliance.
B
Utilize Delta Lake's time travel feature to automatically purge data older than the specified time frame without additional configuration.
C
Configure a scheduled job using a workflow orchestration tool like Apache Airflow to execute the DELETE command on the Delta table based on the retention policy.
D
Implement a custom script that periodically checks the Delta table for old data and uses the VACUUM command to remove it, bypassing the need for scheduling.