
Answer-first summary for fast verification
Answer: C. MERGE, because it applies both incremental updates and deletions to the target table in a single operation, keeping the target consistent with the source without recreating the whole table.
Option C is correct because MERGE is designed to apply incremental changes from a source table to a target table in one statement: matched rows can be updated or deleted, and unmatched source rows can be inserted. Only the rows in the change set are processed, which keeps the target consistent with the source at a lower cost than a full rewrite. Option A also keeps the target consistent, but it recreates the entire table on every run, which is expensive when updates are frequent. Option B likewise replaces the table (or partition) contents wholesale rather than applying row-level changes, so it cannot process deletions from an incremental change set. Option D loads new files efficiently and skips files it has already loaded, but it only appends data and cannot update or delete existing rows.
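To make the matched/unmatched logic concrete, here is a minimal in-memory sketch in Python of what a MERGE does with a change set. It is a model of the semantics only, not the Databricks API: the function name `merge_changes`, the `(key, value, op)` change format, and the sample rows are all hypothetical, and it assumes at most one change per key in a batch.

```python
from typing import Dict, List, Tuple

# Hypothetical change record: (key, value, op), where op is "upsert" or "delete".
Change = Tuple[int, str, str]

# Roughly the SQL this models (table and column names are illustrative):
#   MERGE INTO target t USING changes s ON t.id = s.id
#   WHEN MATCHED AND s.op = 'delete' THEN DELETE
#   WHEN MATCHED THEN UPDATE SET t.value = s.value
#   WHEN NOT MATCHED AND s.op != 'delete' THEN INSERT (id, value) VALUES (s.id, s.value)


def merge_changes(target: Dict[int, str], changes: List[Change]) -> Dict[str, int]:
    """Apply a batch of changes to `target` in place and return row counts."""
    stats = {"inserted": 0, "updated": 0, "deleted": 0}
    for key, value, op in changes:
        if key in target:
            if op == "delete":
                # Matched row flagged as deleted in the source: remove it.
                del target[key]
                stats["deleted"] += 1
            else:
                # Matched row with new data: update it.
                target[key] = value
                stats["updated"] += 1
        elif op != "delete":
            # Unmatched source row: insert it.
            target[key] = value
            stats["inserted"] += 1
        # A delete for a key that is not in the target is a no-op.
    return stats


target = {1: "alice", 2: "bob", 3: "carol"}
changes = [(2, "robert", "upsert"), (3, "", "delete"), (4, "dave", "upsert")]
stats = merge_changes(target, changes)
print(target)  # {1: 'alice', 2: 'robert', 4: 'dave'}
print(stats)   # {'inserted': 1, 'updated': 1, 'deleted': 1}
```

Row 1 is never touched, which is the point of the comparison with options A and B: the work is proportional to the change set, not to the size of the target table.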
Author: LeetQuiz Editorial Team
In a data engineering project, you are tasked with designing a solution to synchronize a target table in Azure Databricks with a source table that frequently receives incremental updates and deletions. The solution must ensure data consistency, handle both updates and deletions efficiently, and minimize operational costs. Considering these requirements, which command should you use and why? (Choose one option.)
A. CREATE OR REPLACE TABLE, because it allows for the complete replacement of the target table with the current state of the source table, ensuring data consistency but at a higher operational cost due to the recreation of the table.
B. INSERT OVERWRITE, because it efficiently overwrites the target table with new data from the source, but fails to account for deletions, leading to potential data inconsistency.
C. MERGE, because it provides a comprehensive solution by allowing for both incremental updates and deletions to be applied to the target table, ensuring data consistency and operational efficiency.
D. COPY INTO, because it is designed for efficiently loading data into a table and can prevent duplication, but lacks the capability to handle deletions, making it unsuitable for this scenario.