
Explanation:
Option C is the correct approach. A high-water mark stores the largest 'last_modified_date' value processed by the previous load. On each new load, a SQL query selects only the records whose 'last_modified_date' is greater than the stored mark, and the mark is then advanced to the largest value just loaded. Only new or updated records reach the Data Warehouse, which reduces the amount of data processed and improves performance.
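The steps above can be sketched in a few lines. This is a minimal illustration, not a production pipeline: it uses Python's built-in sqlite3 module as a stand-in for both the source system and the Data Warehouse, and the 'orders' and 'watermark' tables, their columns, and the function names are all hypothetical. Timestamps are assumed to be ISO 8601 strings so that string comparison matches chronological order.

```python
import sqlite3


def make_demo_databases():
    """Create in-memory source and warehouse databases (hypothetical schemas)."""
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, last_modified_date TEXT)"
    )
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, last_modified_date TEXT)"
    )
    # One row per source table, holding the high-water mark of the last load.
    warehouse.execute(
        "CREATE TABLE watermark (table_name TEXT PRIMARY KEY, last_value TEXT)"
    )
    return source, warehouse


def incremental_load(source, warehouse):
    """Copy rows changed since the stored mark; return the number of rows loaded."""
    row = warehouse.execute(
        "SELECT last_value FROM watermark WHERE table_name = 'orders'"
    ).fetchone()
    # On the first run there is no mark; the empty string sorts before any timestamp.
    mark = row[0] if row else ""

    # Filter on the source side so only new or updated rows are transferred.
    changed = source.execute(
        "SELECT id, amount, last_modified_date FROM orders WHERE last_modified_date > ?",
        (mark,),
    ).fetchall()
    if not changed:
        return 0

    new_mark = max(r[2] for r in changed)
    # Upsert the rows and advance the mark in a single transaction, so a failed
    # load cannot leave the mark ahead of the data.
    with warehouse:
        warehouse.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", changed)
        warehouse.execute(
            "INSERT OR REPLACE INTO watermark VALUES ('orders', ?)", (new_mark,)
        )
    return len(changed)
```

Calling incremental_load(source, warehouse) twice in a row loads nothing the second time, because no source row is newer than the stored mark. One design caveat: with a strict "greater than" comparison, a record written with exactly the same timestamp as the mark after the load has run would be skipped, so real pipelines often add a small overlap window or use a finer-grained change indicator.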
You are tasked with designing a solution for incrementally loading data from a source system to an Azure Data Warehouse. The source system has a 'last_modified_date' column in each table that indicates the last time a record was updated. How would you design the solution to ensure that only new or updated records are loaded into the Data Warehouse?
A. Use a full load approach and overwrite the existing data in the Data Warehouse every time.
B. Create a new table in the Data Warehouse for each incremental load and merge the data manually.
C. Use a combination of a high-water mark technique and a SQL query to filter out the new or updated records based on the 'last_modified_date'.
D. Disable the 'last_modified_date' column and rely on the source system to automatically handle the incremental load.