You need to schedule an Azure Data Factory pipeline to run when a new file arrives in an Azure Data Lake Storage Gen2 container.
Which type of trigger should you use?
Microsoft Azure Data Engineer Associate - DP-203
Explanation:
Trigger Selection for Azure Data Factory Pipeline
When configuring an Azure Data Factory pipeline to execute based on file arrival events in Azure Data Lake Storage Gen2, the storage event trigger is the optimal choice.
Why Storage Event Trigger (Option D) is Correct:
Event-Driven Execution: The storage event trigger responds to Azure Blob Storage events (including ADLS Gen2 accounts), specifically blob creation and blob deletion. There is no separate modification event; overwriting an existing blob raises a blob-created event.
File Arrival Detection: It can be configured to trigger when a new file is created in a specified container, which directly matches the requirement of executing when a new file arrives.
Near Real-Time Processing: The trigger starts the pipeline shortly after the event occurs, with no continuous polling of the container, which makes it efficient for file processing scenarios.
Azure Integration: It leverages Azure Event Grid to monitor storage account events, providing reliable event delivery.
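As an illustration of the points above, here is a minimal Python sketch that assembles a storage event trigger definition in the JSON shape Data Factory uses for this trigger type (BlobEventsTrigger). The helper name build_storage_event_trigger, the trigger name, the pipeline name, the container and folder names, and the placeholder resource ID are all hypothetical; the pipeline is assumed to declare folderPath and fileName parameters.

```python
import json


def build_storage_event_trigger(storage_account_id, pipeline_name, container, folder=""):
    """Return a storage event trigger definition as a plain dict.

    All argument values are supplied by the caller; nothing here talks to Azure.
    """
    return {
        "name": "NewFileArrivedTrigger",  # hypothetical trigger name
        "properties": {
            "type": "BlobEventsTrigger",
            "typeProperties": {
                # Resource ID of the storage account to watch
                "scope": storage_account_id,
                # Fire only when a blob is created (a new file arrives)
                "events": ["Microsoft.Storage.BlobCreated"],
                # Path filter in the form /<container>/blobs/<folder path>
                "blobPathBeginsWith": f"/{container}/blobs/{folder}",
                # Skip zero-byte blobs
                "ignoreEmptyBlobs": True,
            },
            "pipelines": [
                {
                    "pipelineReference": {
                        "referenceName": pipeline_name,
                        "type": "PipelineReference",
                    },
                    # Pass the arriving file's location to the pipeline
                    # (assumes the pipeline declares these two parameters)
                    "parameters": {
                        "folderPath": "@triggerBody().folderPath",
                        "fileName": "@triggerBody().fileName",
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # Placeholder resource ID; replace with a real storage account ID
    account_id = (
        "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
        "/providers/Microsoft.Storage/storageAccounts/<account-name>"
    )
    trigger = build_storage_event_trigger(account_id, "IngestNewFile", "landing", "sales/")
    print(json.dumps(trigger, indent=2))
```

The printed JSON could then be pasted into the trigger's code view or deployed with whatever tooling you already use; the sketch only builds the document and does not create anything in Azure.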
Why Other Options Are Less Suitable:
On-demand (Option A): An on-demand (manual) run must be started explicitly by a user or an API call, so it does not respond automatically to file arrival events.
Tumbling Window (Option B): This is a time-based trigger that executes at fixed time intervals, not in response to specific file events.
Schedule (Option C): This is also time-based and executes at predetermined times, regardless of whether new files have arrived.
Best Practice Considerations:
Storage event triggers are specifically designed for scenarios where pipeline execution should be driven by storage-level events.
They provide better resource utilization compared to scheduled triggers that might run unnecessarily when no new files are present.
The configuration allows filtering by blob path prefixes and suffixes, enabling precise control over which file events should trigger the pipeline.
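To make the prefix and suffix filtering concrete, the following Python sketch mimics how a "begins with" and "ends with" filter pair narrows down which blobs would start the pipeline. It is only an illustration of the matching idea, not the service's actual implementation; the function would_fire and the sample container and paths are hypothetical, and the /<container>/blobs/<path> form follows the convention used for the begins-with filter.

```python
def would_fire(container, blob_path, begins_with="", ends_with=""):
    """Return True if a blob passes both path filters.

    An empty filter matches everything, so leaving one out
    means only the other filter is applied.
    """
    # Path in the form used by the begins-with filter
    full_path = f"/{container}/blobs/{blob_path}"
    return full_path.startswith(begins_with) and full_path.endswith(ends_with)


if __name__ == "__main__":
    samples = [
        ("landing", "sales/2024/orders.csv"),
        ("landing", "sales/2024/orders.json"),
        ("archive", "sales/2024/orders.csv"),
    ]
    for container, path in samples:
        fires = would_fire(container, path, "/landing/blobs/sales/", ".csv")
        print(f"{container}/{path}: {'fires' if fires else 'ignored'}")
```

With the filters shown, only the CSV file under the sales folder of the landing container passes; the JSON file and the file in the other container are ignored.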