
Answer-first summary for fast verification
Answer: Combine the small files into larger files using a custom script or application before loading them into the destination.
Option B is correct: combining the small files into larger files before loading improves performance and reduces storage costs. Each small file incurs its own per-file overhead (metadata operations, connection setup, and task scheduling), so consolidating files up front minimizes that overhead and streamlines the pipeline. Option A is inefficient because it copies and processes every file individually. Option C still moves each small file one by one and adds storage costs for the staging area without solving the small-file problem. Option D provides temporary storage for intermediate processing in certain scenarios (for example, loading into Azure Synapse via PolyBase), but it likewise does not address small-file overhead.
Author: LeetQuiz Editorial Team
In a scenario where you need to implement a data pipeline that processes data from a source system with a large number of small files, which of the following strategies would you use to optimize performance and minimize storage costs in Azure Data Factory?
A. Use the 'Copy Data' activity to individually copy each file from the source system to the destination.
B. Combine the small files into larger files using a custom script or application before loading them into the destination.
C. Use the 'Wildcard' path in the 'Copy Data' activity to copy all files from the source system to a staging area and then process them in the destination.
D. Enable the 'Enable staging' option in the 'Copy Data' activity to use temporary storage for intermediate processing.