In a data integration pipeline, you need to ensure that data is deduplicated upon writing to the target table. Which SQL command would you use to achieve this, and why is deduplication important in data processing?
A. Use CREATE OR REPLACE TABLE to ensure a clean slate with no duplicates.
B. Use INSERT OVERWRITE to overwrite the entire table, thus eliminating duplicates.
C. Use MERGE to insert new records and update existing ones, ensuring deduplication.
D. Use COPY INTO to load data from external sources, inherently preventing duplicates.
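
To make the insert-or-update behavior described in option C concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. SQLite has no MERGE statement, so the sketch swaps in `INSERT ... ON CONFLICT DO UPDATE` (available in SQLite 3.24 and later), which gives the same result for this simple case. The table name `target`, its columns, and the helper `upsert_rows` are hypothetical and do not come from the question.

```python
import sqlite3


def upsert_rows(conn, rows):
    """Insert rows with new ids and update rows whose id already exists.

    This mimics a MERGE keyed on `id`: WHEN MATCHED, update the value;
    WHEN NOT MATCHED, insert the row.
    """
    conn.executemany(
        """
        INSERT INTO target (id, value) VALUES (?, ?)
        ON CONFLICT(id) DO UPDATE SET value = excluded.value
        """,
        rows,
    )
    conn.commit()


# Hypothetical target table; the primary key is what defines a "duplicate".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, value TEXT)")

# First load: two new records.
upsert_rows(conn, [(1, "a"), (2, "b")])

# Second load: id 2 arrives again with a changed value, id 3 is new.
upsert_rows(conn, [(2, "b2"), (3, "c")])

result = conn.execute("SELECT id, value FROM target ORDER BY id").fetchall()
print(result)  # [(1, 'a'), (2, 'b2'), (3, 'c')]
```

After the second load, id 2 still appears only once and carries the newer value. Re-running a load therefore does not multiply rows, which is the property that keeps downstream counts and aggregates from being inflated.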