
In a data engineering project, you are tasked with designing a solution to synchronize a target table in Azure Databricks with a source table that frequently receives incremental updates and deletions. The solution must ensure data consistency, handle both updates and deletions efficiently, and minimize operational costs. Considering these requirements, which command should you use and why? (Choose one option.)
A
CREATE OR REPLACE TABLE, because it allows for the complete replacement of the target table with the current state of the source table, ensuring data consistency but at a higher operational cost due to the recreation of the table.
B
INSERT OVERWRITE, because it efficiently overwrites the target table with new data from the source, but fails to account for deletions, leading to potential data inconsistency.
C
MERGE, because it provides a comprehensive solution by allowing for both incremental updates and deletions to be applied to the target table, ensuring data consistency and operational efficiency.
D
COPY INTO, because it is designed for efficiently loading data into a table and can prevent duplication, but lacks the capability to handle deletions, making it unsuitable for this scenario.
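As a rough illustration of the update-and-delete semantics that option C attributes to MERGE, the following Python sketch applies a batch of changes to a small in-memory table. It is not Databricks code: the `apply_changes` function, the `(key, op, row)` change format, and the sample rows are all hypothetical, chosen only to show how touching just the changed keys differs from rewriting the whole table.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, str]
# Hypothetical change record: (primary key, operation, new row or None for deletes)
Change = Tuple[int, str, Optional[Row]]


def apply_changes(target: Dict[int, Row], changes: List[Change]) -> Dict[int, Row]:
    """Return a copy of `target` with upserts and deletes applied by key."""
    result = dict(target)  # shallow copy so the input table is left untouched
    for key, op, row in changes:
        if op == "delete":
            # Matched key flagged as deleted in the source: drop it if present
            result.pop(key, None)
        elif op == "upsert":
            if row is None:
                raise ValueError(f"upsert for key {key} needs a row")
            # Matched key: overwrite the row; unmatched key: insert it
            result[key] = row
        else:
            raise ValueError(f"unknown operation: {op}")
    return result


# Illustrative data only
target = {1: {"name": "a"}, 2: {"name": "b"}, 3: {"name": "c"}}
changes = [
    (2, "upsert", {"name": "b2"}),  # update an existing row
    (3, "delete", None),            # remove a row deleted at the source
    (4, "upsert", {"name": "d"}),   # insert a new row
]
synced = apply_changes(target, changes)
```

Only keys 2, 3, and 4 are touched here; row 1 is carried over unchanged, which is the property that separates an incremental merge from a full table replacement.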