
You are working with Delta Lake on Azure Databricks and must efficiently propagate deletes in a Delta Lake table using the Change Data Feed (CDF) feature. The solution must minimize operational overhead, ensure data integrity, and use the CDF to track changes. Given these requirements, which of the following methods is BEST? Choose one option.
A
Implement a custom solution that scans the entire table to identify and remove deleted records, bypassing the CDF for simplicity.
B
Use the delete method directly on the Delta Lake table to remove records, then manually update the CDF to reflect these changes.
C
Apply the update method to mark records as deleted with a flag, requiring additional queries to filter out these records in downstream processes.
D
Utilize the merge method to perform an upsert operation, incorporating a deletion flag that is propagated through the CDF, enabling efficient tracking and processing of deletes.
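
For background on the scenario: when CDF is enabled, each change row read from the feed carries a `_change_type` value (`insert`, `update_preimage`, `update_postimage`, or `delete`) along with a `_commit_version`. The sketch below is a pure-Python simulation of how a downstream consumer might apply such rows, including deletes. It does not use Spark or Delta Lake, and the row layout, the `id` key, and the `apply_changes` helper are hypothetical names chosen for illustration.

```python
# Pure-Python simulation: no Spark or Delta Lake required.
# The _change_type values mirror those emitted by the Delta change data feed;
# the row layout, the "id" key, and apply_changes are hypothetical.

def apply_changes(target, changes):
    """Apply CDF-style change rows to a downstream dict keyed by id."""
    # Process commits in version order so that later changes win.
    for row in sorted(changes, key=lambda r: r["_commit_version"]):
        kind = row["_change_type"]
        if kind == "delete":
            # Propagate the delete; ignore keys that are already absent.
            target.pop(row["id"], None)
        elif kind in ("insert", "update_postimage"):
            target[row["id"]] = row["value"]
        # update_preimage rows hold the old value and are skipped here.
    return target


downstream = {1: "a", 2: "b", 3: "c"}
changes = [
    {"id": 2, "value": "b", "_change_type": "delete", "_commit_version": 5},
    {"id": 3, "value": "c", "_change_type": "update_preimage", "_commit_version": 4},
    {"id": 3, "value": "c2", "_change_type": "update_postimage", "_commit_version": 4},
    {"id": 4, "value": "d", "_change_type": "insert", "_commit_version": 4},
]
result = apply_changes(downstream, changes)
```

In a real pipeline the same logic would typically be expressed as a merge against the downstream table, driven by the change rows rather than by a full-table scan.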