
Answer-first summary for fast verification
Answer: D. If you anticipate ingesting millions of files or more over time
The choice between Auto Loader and the COPY INTO command depends mainly on file volume and schema stability. When you expect files on the order of millions or more over time, Auto Loader is the better fit: it needs fewer total operations to discover new files than COPY INTO and can split the work into multiple batches, so it is more efficient at scale. It also has better support for schema inference and evolution, which suits data whose schema changes frequently. COPY INTO is sufficient for smaller volumes, on the order of thousands of files, and reloading a small subset of re-uploaded files can be easier to manage with it. Reference: [Databricks Documentation](https://docs.databricks.com/ingestion/index.html#when-to-use-copy-into-and-when-to-use-auto-loader)
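To make the two options concrete, here is a minimal sketch of each in Python. It assumes a Databricks runtime for the Auto Loader part; the helper names (`copy_into_sql`, `start_auto_loader`), the table name, and all paths are hypothetical and chosen only for illustration. The first helper only builds a COPY INTO statement as a string, so it can be inspected without a cluster.

```python
def copy_into_sql(target_table: str, source_path: str, file_format: str = "JSON") -> str:
    """Build a COPY INTO statement (reasonable for file counts in the thousands).

    Hypothetical helper: it only assembles the SQL text and does not run it.
    """
    return (
        f"COPY INTO {target_table}\n"
        f"FROM '{source_path}'\n"
        f"FILEFORMAT = {file_format}\n"
        "FORMAT_OPTIONS ('inferSchema' = 'true')\n"
        "COPY_OPTIONS ('mergeSchema' = 'true')"
    )


def start_auto_loader(spark, source_path, schema_path, checkpoint_path,
                      target_table, file_format="json"):
    """Start an Auto Loader stream (preferred for millions of files over time).

    Hypothetical helper: `spark` is assumed to be a SparkSession on Databricks,
    where the `cloudFiles` source is available.
    """
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", file_format)
        # Where Auto Loader stores the inferred schema and tracks its evolution.
        .option("cloudFiles.schemaLocation", schema_path)
        .load(source_path)
        .writeStream
        # The checkpoint records which files have already been ingested.
        .option("checkpointLocation", checkpoint_path)
        # Allow new columns to be added to the target table.
        .option("mergeSchema", "true")
        # Process everything currently available, then stop.
        .trigger(availableNow=True)
        .toTable(target_table)
    )


if __name__ == "__main__":
    # Runs anywhere: prints the statement you would pass to spark.sql(...).
    print(copy_into_sql("bronze.events", "/mnt/raw/events"))
```

On a cluster, the COPY INTO text would be passed to `spark.sql(...)`, while `start_auto_loader(...)` returns a streaming query that picks up only files it has not seen before.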
Author: LeetQuiz Editorial Team
When deciding between using Auto Loader or the COPY INTO command for incrementally loading input data files, in which scenario is Auto Loader the preferred choice?
A
If the data schema is not expected to change frequently
B
If you are loading a small subset of files that have been re-uploaded
C
If you are dealing with a few thousand files
D
If you anticipate ingesting millions of files or more over time
E
There is no significant difference between using Auto Loader and the COPY INTO command