
Answer-first summary for fast verification
Answer: Use MERGE to insert new records and update existing ones, ensuring deduplication.
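MERGE is the appropriate command for deduplicating on write because it matches incoming rows against the target table on a key and then acts conditionally: rows that already exist are updated and rows that do not are inserted, so re-delivered records do not create duplicates. The other options never compare incoming rows with existing ones: CREATE OR REPLACE TABLE and INSERT OVERWRITE replace the table's contents with whatever the query returns, duplicates included, and COPY INTO loads files without matching on record keys. Deduplication matters because duplicate rows compromise data integrity, waste storage, and can distort or slow down queries.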
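As a rough illustration of the matched/not-matched logic described above, here is a minimal pure-Python sketch that emulates MERGE semantics on an in-memory table. It is not a database API: the function `merge_upsert`, the `target` and `incoming` data, and the `id` and `value` column names are all hypothetical, and the SQL in the comment only indicates the general shape of the statement being imitated.

```python
# Pure-Python sketch of MERGE (upsert) semantics. It imitates a statement of
# roughly this shape (table and column names are hypothetical):
#
#   MERGE INTO target AS t
#   USING updates AS s
#   ON t.id = s.id
#   WHEN MATCHED THEN UPDATE SET value = s.value
#   WHEN NOT MATCHED THEN INSERT (id, value) VALUES (s.id, s.value)


def merge_upsert(target, source_rows, key="id"):
    """Return a new table (dict keyed by `key`) with source_rows merged in."""
    merged = dict(target)  # copy so the caller's table is left untouched
    for row in source_rows:
        k = row[key]
        if k in merged:
            # WHEN MATCHED: update the existing row in place of adding a copy
            merged[k] = {**merged[k], **row}
        else:
            # WHEN NOT MATCHED: insert the new row
            merged[k] = dict(row)
    return merged


# Existing target table and an incoming batch (assumed to have unique keys).
target = {
    1: {"id": 1, "value": "a"},
    2: {"id": 2, "value": "b"},
}
incoming = [
    {"id": 2, "value": "b2"},  # already present: updated
    {"id": 3, "value": "c"},   # new: inserted
]

result = merge_upsert(target, incoming)

# Replaying the same batch changes nothing, which is the deduplicating effect.
replayed = merge_upsert(result, incoming)
print(len(result), len(replayed))
```

One caveat about the sketch: a real MERGE evaluates matches against the target table only, so it generally does not remove duplicates that occur inside the incoming batch itself. The example assumes the batch already has one row per key, which in practice usually means deduplicating the source before merging.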
Author: LeetQuiz Editorial Team
In a data integration pipeline, you need to ensure that data is deduplicated upon writing to the target table. Which SQL command would you use to achieve this, and why is deduplication important in data processing?
A. Use CREATE OR REPLACE TABLE to ensure a clean slate with no duplicates.
B. Use INSERT OVERWRITE to overwrite the entire table, thus eliminating duplicates.
C. Use MERGE to insert new records and update existing ones, ensuring deduplication.
D. Use COPY INTO to load data from external sources, inherently preventing duplicates.