
When using the COPY INTO statement for incremental data loading into a Delta Lake table, how can you ensure that only new files are processed and duplicates are avoided?
A
Delta Lake automatically ignores files that have been previously loaded, requiring no additional configuration.
B
Manually track processed files in a separate Delta table and filter them out in the COPY INTO command.
C
Utilize the IGNORE_EXISTING option to skip over files that have already been processed.
D
Implement a custom Spark function to compare the contents of incoming files with existing data before loading.
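For context, the Databricks documentation describes COPY INTO as idempotent: files that have already been loaded into the target table are tracked and skipped on subsequent runs. Below is a minimal sketch of an incremental load using COPY INTO, assuming a Databricks environment; the catalog, schema, table, and path names are hypothetical placeholders.

```python
from pyspark.sql import SparkSession

# In a Databricks notebook `spark` already exists; getOrCreate() keeps the
# sketch self-contained when run as a standalone script.
spark = SparkSession.builder.getOrCreate()

# COPY INTO keeps track of files it has already ingested into the target
# table, so re-running the same statement loads only files that are new
# since the previous run -- no manual bookkeeping of processed files needed.
spark.sql("""
    COPY INTO my_catalog.my_schema.sales_bronze      -- hypothetical target Delta table
    FROM '/mnt/landing/sales/'                       -- hypothetical source directory
    FILEFORMAT = CSV
    FORMAT_OPTIONS ('header' = 'true', 'inferSchema' = 'true')
    COPY_OPTIONS ('mergeSchema' = 'true')
""")
```

Running this statement on a schedule (for example, from a Databricks job) repeatedly ingests only newly arrived files, which is what makes COPY INTO suitable for simple incremental loading without custom deduplication logic.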