
Answer-first summary for fast verification
Answer: MERGE, because it enables writing processed data to the target table while efficiently handling duplicates and deletions, ensuring consistency and scalability.
Option C is correct. MERGE matches source rows to the target table on a key and, in a single atomic transaction, updates matching rows, inserts new ones, and deletes rows that the source marks as removed, so the target table stays consistent. Because it modifies only the affected data instead of rewriting the whole table, it also scales cost-effectively to large volumes. Options A and B replace the table contents wholesale and do not reconcile duplicates or deletions against existing rows, which makes them less suitable here. Option D skips files it has already loaded, which prevents duplicate ingestion, but it only appends data and does not address deletions or row-level duplicates.
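As a rough illustration of why MERGE fits, the sketch below pairs a hypothetical Delta Lake MERGE statement with a small pure-Python model of the same semantics. The table names (target, updates), the columns (id, value, updated_at, is_deleted), and the helper functions are assumptions made for this example and are not part of the question. It assumes deletions arrive as flagged rows and that duplicates are resolved by keeping the latest row per key before merging.

```python
from typing import Dict, Iterable, TypedDict

# Hypothetical SQL for the pipeline; every table and column name is illustrative.
# The source is deduplicated first so each target row matches at most one source row.
MERGE_SQL = """
MERGE INTO target AS t
USING (
  SELECT id, value, is_deleted
  FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY updated_at DESC) AS rn
    FROM updates
  ) ranked
  WHERE rn = 1
) AS s
ON t.id = s.id
WHEN MATCHED AND s.is_deleted THEN DELETE
WHEN MATCHED THEN UPDATE SET t.value = s.value
WHEN NOT MATCHED AND NOT s.is_deleted THEN INSERT (id, value) VALUES (s.id, s.value)
"""


class Change(TypedDict):
    """One source row in the assumed change-feed layout."""
    id: int
    value: str
    updated_at: int
    is_deleted: bool


def latest_per_key(changes: Iterable[Change]) -> Dict[int, Change]:
    """Collapse duplicate source rows, keeping the most recent row per id."""
    latest: Dict[int, Change] = {}
    for change in changes:
        current = latest.get(change["id"])
        if current is None or change["updated_at"] >= current["updated_at"]:
            latest[change["id"]] = change
    return latest


def merge(target: Dict[int, str], changes: Iterable[Change]) -> Dict[int, str]:
    """Model the MERGE above: delete flagged keys, update matches, insert new keys."""
    result = dict(target)  # leave the caller's table untouched
    for key, change in latest_per_key(changes).items():
        if change["is_deleted"]:
            result.pop(key, None)  # delete if present, otherwise do nothing
        else:
            result[key] = change["value"]  # update an existing row or insert a new one
    return result


if __name__ == "__main__":
    table = {1: "a", 2: "b"}
    feed = [
        {"id": 1, "value": "a2", "updated_at": 1, "is_deleted": False},
        {"id": 1, "value": "a3", "updated_at": 2, "is_deleted": False},  # duplicate key
        {"id": 2, "value": "b", "updated_at": 1, "is_deleted": True},    # deletion
        {"id": 3, "value": "c", "updated_at": 1, "is_deleted": False},   # new row
    ]
    print(merge(table, feed))  # {1: 'a3', 3: 'c'}
```

In this model, applying the same batch twice yields the same table, which is the property that keeps the target consistent when a pipeline run is retried. On Databricks, the statement in MERGE_SQL would typically be submitted through spark.sql; that step is left out so the sketch runs without a cluster.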
Author: LeetQuiz Editorial Team
You are designing a data processing pipeline in Azure Databricks that reads from a source table, processes the data, and writes the results to a target table. The pipeline must ensure that the target table remains in a consistent state, even in the event of duplicate records or deletions from the source. Additionally, the solution must be cost-effective and scalable to handle large volumes of data. Which command should you use to achieve these requirements? (Choose one option.)
A
CREATE OR REPLACE TABLE, because it allows for the creation of a new table or replacement of an existing one with processed data, ensuring consistency without handling duplicates or deletions.
B
INSERT OVERWRITE, because it efficiently overwrites the target table with processed data, ensuring consistency but not handling duplicates or deletions.
C
MERGE, because it enables writing processed data to the target table while efficiently handling duplicates and deletions, ensuring consistency and scalability.
D
COPY INTO, because it inserts data into the target table preventing duplication but does not handle deletions or ensure consistency as effectively as other methods.