
You are designing a data pipeline where data from an external source needs to be loaded into a Delta Lake table without duplicating existing records. Which SQL command would you use for this purpose, and why is it effective in preventing data duplication?
A. Use CREATE OR REPLACE TABLE to replace the entire table with new data.
B. Use INSERT OVERWRITE to selectively overwrite partitions.
C. Use MERGE to conditionally insert or update records.
D. Use COPY INTO to load data from external sources without duplication.
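
To make the "conditionally insert or update" behavior in option C concrete, here is a minimal sketch. The table names `customers` and `customer_updates`, the key column `id`, and the helper `merge` are all hypothetical and do not come from the question. The SQL string shows the general shape of a Delta Lake MERGE statement. The Python function only emulates its matching logic in memory, assuming keys are unique in both inputs; it is not how Delta Lake implements the command.

```python
# Shape of the Delta Lake statement this sketch mirrors.
# Table and column names are hypothetical; the string is not executed here.
MERGE_SQL = """
MERGE INTO customers AS t
USING customer_updates AS s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
"""


def merge(target, source, key="id"):
    """Emulate MERGE in plain Python (assumes unique keys in both inputs)."""
    by_key = {row[key]: dict(row) for row in target}
    for row in source:
        if row[key] in by_key:
            by_key[row[key]].update(row)  # WHEN MATCHED THEN UPDATE
        else:
            by_key[row[key]] = dict(row)  # WHEN NOT MATCHED THEN INSERT
    return list(by_key.values())


target = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bob"}]
source = [{"id": 2, "name": "Robert"}, {"id": 3, "name": "Cy"}]

once = merge(target, source)
twice = merge(once, source)  # replaying the same batch adds no new rows

print(once)
print(twice == once)
```

Because each incoming row is first matched against the key, loading the same batch a second time updates existing rows in place instead of appending copies, so the row count stays the same.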