
Answer-first summary for fast verification
Answer: A (Specify a file naming pattern for the destination) and C (Filter by the last modified date of the source files).
## Detailed Explanation

To minimize daily data transfer between Azure Blob Storage and Azure Data Lake Storage Gen2, the solution must focus on **incremental loading**: transferring only new or modified files each day rather than the full set.

### ✅ **Correct Answers: A and C**

**A: Specify a file naming pattern for the destination**

- Organizes files into the {Year}/{Month}/{Day}/ folder structure
- Makes it easy to track which files have already been copied
- Supports incremental loading by keeping files systematically arranged and preventing duplicate transfers

**C: Filter by the last modified date of the source files**

- The **primary mechanism** for minimizing data transfer
- Lets the pipeline copy only files modified on the current day
- Avoids re-transferring files that have not changed since the last load, directly addressing the requirement to reduce daily data volume

### ❌ **Incorrect Answers**

**B: Delete the files in the destination before loading the data**

- Would **increase** data transfer by forcing a full re-upload every day
- Contradicts the goal of minimizing transfer volume and adds overhead plus a risk of data loss

**D: Delete the source files after they are copied**

- Prevents duplicate transfers, but is not necessary for minimizing data transfer
- Could disrupt source-system operations and create data retention issues; filtering by modification date achieves the same goal without destructive operations

### **Optimal Strategy**

Combining a **destination file naming pattern (A)** with **last-modified-date filtering (C)** creates an efficient incremental loading solution:

- Date filtering ensures only new or modified files are transferred
- The dated folder structure supports tracking and prevents duplication
- Together they minimize network bandwidth usage and reduce processing time while maintaining data integrity
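The combined logic of A and C can be sketched in plain Python. This is an illustrative model only, not the Azure SDK or the actual Data Factory runtime: the `files_to_copy` helper, the blob list, and the dates are hypothetical. In a real pipeline, the same filter is configured on the copy activity's source (a last-modified window) and the same path pattern on the sink dataset.

```python
from datetime import datetime, timedelta, timezone

def destination_path(run_date: datetime, filename: str) -> str:
    """Build the {Year}/{Month}/{Day}/ destination key (answer A)."""
    return f"{run_date:%Y}/{run_date:%m}/{run_date:%d}/{filename}"

def files_to_copy(blobs, window_start: datetime, window_end: datetime):
    """Keep only blobs last modified inside today's window (answer C)."""
    return [name for name, modified in blobs
            if window_start <= modified < window_end]

# Hypothetical daily run: only the blob modified today is selected,
# so unchanged files are never re-transferred.
day_start = datetime(2024, 5, 1, tzinfo=timezone.utc)
day_end = day_start + timedelta(days=1)
blobs = [
    ("sales.parquet", day_start + timedelta(hours=2)),  # modified today
    ("old.parquet",   day_start - timedelta(days=3)),   # unchanged, skipped
]
selected = files_to_copy(blobs, day_start, day_end)
```

Here `selected` contains only `sales.parquet`, and `destination_path` would place it under `2024/05/01/`, which is exactly why the two configurations together minimize daily transfer volume.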
Author: LeetQuiz Editorial Team
You are designing a solution to copy daily Parquet files from an Azure Blob Storage account to an Azure Data Lake Storage Gen2 account. The destination folder structure is {Year}/{Month}/{Day}/. The goal is to minimize the daily data transfer between the two accounts.
Which two configurations should you include in the design of the Azure Data Factory data load? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.
A. Specify a file naming pattern for the destination.
B. Delete the files in the destination before loading the data.
C. Filter by the last modified date of the source files.
D. Delete the source files after they are copied.