
Answer-first summary for fast verification
Answer: A. Delta Lake automatically ignores files that have been previously loaded, requiring no additional configuration. COPY INTO is a retriable, idempotent operation: files in the source location that have already been loaded are skipped on later runs. Reference: https://docs.databricks.com/aws/en/sql/language-manual/delta-copy-into
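To make the idempotent behavior concrete, here is a minimal sketch that builds a COPY INTO statement as a string. The helper name build_copy_into, the table name main.sales.orders, and the bucket path are hypothetical and used only for illustration. The 'force' copy option is shown only to contrast with the default: leaving it unset keeps the skip-already-loaded behavior, while setting it to 'true' disables idempotency and reloads every file.

```python
def build_copy_into(target_table: str, source_path: str,
                    file_format: str = "PARQUET", force: bool = False) -> str:
    """Build a COPY INTO statement (hypothetical helper for illustration).

    With force=False (the default) no copy option is added, so COPY INTO
    keeps its default behavior of skipping files it has already loaded.
    """
    stmt = (
        f"COPY INTO {target_table}\n"
        f"FROM '{source_path}'\n"
        f"FILEFORMAT = {file_format}"
    )
    if force:
        # Disables idempotency: every file is loaded again, which can
        # introduce duplicates. Shown only for contrast.
        stmt += "\nCOPY_OPTIONS ('force' = 'true')"
    return stmt


# Table name and path are placeholders, not taken from the question.
incremental_sql = build_copy_into("main.sales.orders",
                                  "s3://example-bucket/orders/")
print(incremental_sql)
```

In a Databricks notebook the resulting string could be passed to spark.sql(incremental_sql). Running the same statement again should load only files that arrived since the previous run, which is why no manual tracking table or content comparison is needed.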
Author: LeetQuiz Editorial Team
When using the COPY INTO statement for incremental data loading into Delta Lake, how can you ensure that only new files are processed so that duplicates are avoided?
A. Delta Lake automatically ignores files that have been previously loaded, requiring no additional configuration.
B. Manually track processed files in a separate Delta table and filter them out in the COPY INTO command.
C. Utilize the IGNORE_EXISTING option to skip over files that have already been processed.
D. Implement a custom Spark function to compare the contents of incoming files with existing data before loading.