
Ultimate access to all questions.
The data engineering team performs a nightly full overwrite of the customer_churn_params Delta Lake table used for machine learning. To ensure data quality, the team must identify the specific row-level differences between the current table version and the version immediately preceding the update.
Which method should be used to achieve this?
A
Directly parse the JSON files within the _delta_log directory to identify and extract row-level changes from the underlying Parquet data files.
B
Execute DESCRIBE HISTORY customer_churn_params to retrieve the operation metrics and extract a detailed log of the specific records that were modified.
C
Analyze the Spark event logs in the cluster UI to identify the specific records processed during the overwrite operation.
D
Leverage Delta Lake's time travel feature to query the table at its current and previous versions, then use a set-based comparison like EXCEPT to find differences.