© 2025 LeetQuiz All rights reserved.
Databricks Certified Data Engineer - Professional

When using the COPY INTO statement for incremental data loading into Delta Lake, how can you ensure that only new files are processed so that duplicates are avoided?

Explanation:

B. Manually track processed files in a separate Delta table and filter them out in the COPY INTO command. This approach gives explicit control over the data loading process: the tracking table records exactly which files have been loaded, so duplicates are avoided without depending on built-in behavior that may not offer sufficient control for every scenario.

A. Incorrect according to the answer key, on the grounds that previously loaded files are not ignored automatically. This reasoning is questionable: COPY INTO is documented as an idempotent operation that skips source files already loaded into the target table, unless the force copy option is set to true.

C. Incorrect, because IGNORE_EXISTING is not a valid option for the COPY INTO command.

D. While technically possible, this method is more complex and less efficient than manually tracking processed files.
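
The manual-tracking approach in option B can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the helper names select_new_files and build_copy_into, the table name, and the source path are all hypothetical. It assumes the set of processed files has already been read from the tracking Delta table, and it only builds the SQL text, using the FILES clause of COPY INTO to restrict the load to the new files.

```python
from typing import Iterable, List, Set


def select_new_files(listed: Iterable[str], processed: Set[str]) -> List[str]:
    """Return the listed files that are not yet recorded as processed (sorted)."""
    return sorted(set(listed) - processed)


def build_copy_into(target: str, source: str, files: List[str],
                    file_format: str = "CSV") -> str:
    """Build a COPY INTO statement limited to the given files (hypothetical helper)."""
    if not files:
        raise ValueError("no new files to load")
    for name in files:
        # Keep the sketch simple: refuse names that would need SQL escaping.
        if "'" in name:
            raise ValueError(f"unsupported file name: {name!r}")
    quoted = ", ".join(f"'{name}'" for name in files)
    return (
        f"COPY INTO {target}\n"
        f"FROM '{source}'\n"
        f"FILEFORMAT = {file_format}\n"
        f"FILES = ({quoted})"
    )


# Hypothetical inputs: a directory listing and the contents of the tracking table.
listed_files = ["2025-01-02.csv", "2025-01-01.csv", "2025-01-03.csv"]
already_processed = {"2025-01-01.csv"}

new_files = select_new_files(listed_files, already_processed)
statement = build_copy_into("sales_bronze", "/mnt/raw/sales", new_files)
print(statement)
```

In a Databricks notebook the generated statement would presumably be executed with spark.sql(statement), and the loaded file names appended to the tracking table afterwards; both steps are left out here because they depend on the environment.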
