
Explanation:
According to the official Databricks documentation (https://docs.databricks.com/aws/en/ingestion/cloud-object-storage/auto-loader/), Auto Loader supports two file detection modes:
Directory listing mode – Periodically lists files in the source directory and compares them to previously processed files.
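File notification mode – Subscribes to cloud-native notification and queue services (e.g., AWS SNS and SQS, Azure Event Grid and Queue Storage, GCP Pub/Sub) so that new files are detected from events rather than by listing the directory, which scales better for large or high-volume sources.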
Checkpointing is used to track processed files and ensure exactly-once processing, but it is not a file detection method. Watermarking is used in streaming to handle late-arriving data, not for detecting new files.
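The distinction above can be sketched in PySpark. This is a minimal illustration, not a definitive setup: the helper names `auto_loader_options` and `start_ingest`, the JSON format, and the path and table arguments are hypothetical, and the code assumes a Databricks runtime where the `cloudFiles` source is available. The detection mode is chosen with the `cloudFiles.useNotifications` reader option, while the checkpoint is configured separately on the writer.

```python
def auto_loader_options(file_format, use_notifications):
    """Build Auto Loader reader options for the chosen file detection mode."""
    return {
        "cloudFiles.format": file_format,
        # "false" (the default) selects directory listing mode;
        # "true" selects file notification mode.
        "cloudFiles.useNotifications": str(use_notifications).lower(),
    }


def start_ingest(spark, source_path, target_table, checkpoint_path,
                 use_notifications=False):
    """Start an Auto Loader stream (hypothetical helper, assumed paths)."""
    options = auto_loader_options("json", use_notifications)
    stream = (
        spark.readStream.format("cloudFiles")
        .options(**options)
        # Location where the inferred schema is tracked (assumption: reuse
        # the checkpoint path for simplicity).
        .option("cloudFiles.schemaLocation", checkpoint_path)
        .load(source_path)
    )
    return (
        stream.writeStream
        # The checkpoint records which files were already processed;
        # it does not decide how new files are discovered.
        .option("checkpointLocation", checkpoint_path)
        .toTable(target_table)
    )
```

Switching between the two modes changes only the reader option; the checkpoint location stays the same, which reflects why checkpointing is not itself a detection method.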