
Explanation:
Delta Lake automatically versions every commit to a table in its transaction log. Using Time Travel, you can query any historical snapshot by specifying a version number or timestamp. By querying the table 'as of' two successive versions and performing a set-based comparison (such as using the EXCEPT operator or a FULL OUTER JOIN), you can pinpoint the exact rows that changed between versions.
numOutputRows), it does not return the actual data content of the changed records.Ultimate access to all questions.
A data engineering team updates a Delta Lake table named customer_churn_params nightly by overwriting it with new data. Following each successful update, the team needs to identify the record-level differences between the current version and the previous version. Which method should be used to achieve this?
A
Execute the DESCRIBE HISTORY customer_churn_params command to retrieve the operation metrics and a log of all added or modified records.
B
Analyze the Spark event logs to identify the specific rows that were updated, inserted, or deleted during the job execution.
C
Use Delta Lake’s versioning and time travel features to execute a query (e.g., using EXCEPT or a join) that compares the current version with the previous version.
D
Manually parse the Delta Lake transaction log files (_delta_log) to identify and decode the specific Parquet files containing the new data.
No comments yet.