
Answer-first summary for fast verification
Answer: Use a MERGE statement with a timestamp-based condition to perform the upsert operation.
Using a MERGE statement with a timestamp-based condition ensures that the upsert operation is both efficient and maintains data integrity. This method leverages the capabilities of Delta Lake to handle frequent updates and large datasets without compromising performance.
Author: LeetQuiz Editorial Team
Ultimate access to all questions.
No comments yet.
You are required to upsert batch data into a Delta Lake, where the data includes sensor readings that are updated every few minutes. Describe the strategy you would use to ensure that the upsert operation is efficient and maintains data integrity.
A
Use a MERGE statement with a timestamp-based condition to perform the upsert operation.
B
Delete the existing data and insert the new data each time to ensure consistency.
C
Perform the upsert operation using a Python script that checks for duplicates before inserting.
D
Manually compare the new data with the existing data in the Delta Lake and update records one by one.