
Answer-first summary for fast verification
Answer: AUTO LOADER
The correct answer is AUTO LOADER. Auto Loader is preferred over the COPY INTO SQL command when the number of files is in the millions or more, because it discovers new files efficiently and can split processing into multiple batches. Both tools ingest incrementally, but COPY INTO finds new files only by listing the source directory, whereas Auto Loader supports two discovery modes: directory listing (the default) and file notification, which relies on cloud provider notification and queue services to detect files as they arrive in cloud object storage. Auto Loader is built on Spark Structured Streaming, and it offers stronger support for schema inference and evolution, which makes it the better fit for data whose schema changes over time.
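The setup described above might be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes a Databricks runtime where the `cloudFiles` source is available, the helper names `auto_loader_options` and `start_ingest` are hypothetical, and all paths and the table name are placeholders supplied by the caller. The `spark` session is passed in rather than created here.

```python
from typing import Any, Dict


def auto_loader_options(
    schema_location: str,
    file_format: str = "json",
    use_notifications: bool = False,
) -> Dict[str, str]:
    """Build reader options for an Auto Loader (cloudFiles) stream.

    Hypothetical helper for illustration; only the option keys are
    Auto Loader settings.
    """
    return {
        "cloudFiles.format": file_format,
        # Where Auto Loader persists the inferred schema between runs.
        "cloudFiles.schemaLocation": schema_location,
        # Add newly seen columns to the tracked schema.
        "cloudFiles.schemaEvolutionMode": "addNewColumns",
        # "false" = directory listing, "true" = file notification mode.
        "cloudFiles.useNotifications": str(use_notifications).lower(),
    }


def start_ingest(
    spark: Any,
    source_path: str,
    target_table: str,
    schema_location: str,
    checkpoint_location: str,
) -> Any:
    """Start an incremental ingest from cloud storage into a table."""
    stream = (
        spark.readStream.format("cloudFiles")
        .options(**auto_loader_options(schema_location))
        .load(source_path)
    )
    return (
        stream.writeStream
        # The checkpoint records which files have already been processed.
        .option("checkpointLocation", checkpoint_location)
        # Let the target Delta table accept the added columns.
        .option("mergeSchema", "true")
        # Process everything currently available, then stop.
        .trigger(availableNow=True)
        .toTable(target_table)
    )
```

With `addNewColumns`, the stream is expected to stop when it first encounters a new column and to pick up the updated schema on restart, so a job like this is usually scheduled with automatic retries. Passing `use_notifications=True` to the options helper switches discovery from directory listing to file notification mode, which requires the corresponding cloud permissions.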
Author: LeetQuiz Editorial Team
You are tasked with ingesting millions of files uploaded to cloud object storage, and the schema of these files is expected to change over time. The ingestion process must handle these schema changes automatically. Which method is best suited for incrementally ingesting this data?
A. COPY INTO
B. Structured Streaming
C. AUTO LOADER
D. AUTO APPEND
E. Checkpoint