
Answer-first summary for fast verification
Answer: Implement a watermarking strategy by creating a pipeline that identifies the maximum transaction ID from the previous load and uses this as a filter for the next load.
The correct approach is a watermarking strategy: record the maximum transaction ID or timestamp from the previous load and use it as a filter for the next load, so that only rows above the stored watermark are read. Only new transactions are then loaded into the data warehouse, which avoids re-copying the full table each day and reduces load time and resource usage. This assumes the watermark column increases monotonically with each new transaction.
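The watermark logic can be illustrated with a minimal sketch. It is not an Azure Data Factory pipeline definition: SQLite stands in for Azure SQL Database, and the table names (`source_sales`, `warehouse_sales`, `watermark`) and the function `incremental_load` are hypothetical. The four steps loosely mirror a common Data Factory layout, assumed here rather than taken from the question: one lookup for the old watermark, one lookup for the new watermark, a copy filtered between the two, and a final watermark update.

```python
import sqlite3


def incremental_load(conn, table_name="source_sales"):
    """Copy only rows whose transaction_id is above the stored watermark."""
    cur = conn.cursor()

    # Step 1: look up the watermark saved by the previous load.
    old_wm = cur.execute(
        "SELECT watermark_value FROM watermark WHERE table_name = ?",
        (table_name,),
    ).fetchone()[0]

    # Step 2: look up the new watermark (current maximum ID in the source).
    # COALESCE keeps the old value if the source table is empty.
    new_wm = cur.execute(
        "SELECT COALESCE(MAX(transaction_id), ?) FROM source_sales",
        (old_wm,),
    ).fetchone()[0]

    # Step 3: copy only the delta between the old and new watermark.
    rows = cur.execute(
        "SELECT transaction_id, amount FROM source_sales "
        "WHERE transaction_id > ? AND transaction_id <= ?",
        (old_wm, new_wm),
    ).fetchall()
    cur.executemany("INSERT INTO warehouse_sales VALUES (?, ?)", rows)

    # Step 4: persist the new watermark for the next run.
    cur.execute(
        "UPDATE watermark SET watermark_value = ? WHERE table_name = ?",
        (new_wm, table_name),
    )
    conn.commit()
    return len(rows)


def demo():
    """Run three loads against an in-memory database (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE source_sales (transaction_id INTEGER PRIMARY KEY, amount REAL);
        CREATE TABLE warehouse_sales (transaction_id INTEGER PRIMARY KEY, amount REAL);
        CREATE TABLE watermark (table_name TEXT PRIMARY KEY, watermark_value INTEGER);
        INSERT INTO watermark VALUES ('source_sales', 0);
        """
    )
    conn.executemany(
        "INSERT INTO source_sales VALUES (?, ?)",
        [(1, 10.0), (2, 20.0), (3, 30.0)],
    )
    first = incremental_load(conn)   # initial load picks up all three rows
    conn.executemany(
        "INSERT INTO source_sales VALUES (?, ?)",
        [(4, 40.0), (5, 50.0)],
    )
    second = incremental_load(conn)  # only the two new rows
    third = incremental_load(conn)   # nothing new since the last run
    return first, second, third
```

Calling `demo()` runs an initial load, a load after two new transactions arrive, and a load with no new data. Bounding the copy on both sides (`> old` and `<= new`) means rows inserted while the copy is running are left for the next run instead of being skipped when the watermark is updated.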
Author: LeetQuiz Editorial Team
You are tasked with designing an incremental data load strategy for a large e-commerce company that experiences daily fluctuations in sales data. The company wants to ensure that the data warehouse is updated with only the new transactions each day. Describe the steps you would take to implement this strategy using Azure Data Factory, including the use of watermarks and the integration of Azure SQL Database as the source system.
A. Use a full load approach and schedule the pipeline to run every day at midnight.
B. Implement a watermarking strategy by creating a pipeline that identifies the maximum transaction ID from the previous load and uses this as a filter for the next load.
C. Manually update the data warehouse each day with the new transactions.
D. Use a lookup activity to retrieve the latest transaction date and filter the source data based on this date.