
Answer-first summary for fast verification
Answer: Apply data filtering at the source within the NoSQL database to reduce the volume of data read into the dataflow, ensuring only necessary data is processed.
Filtering data at the source (D) is the most effective approach because it addresses the root cause of the slowdown: the volume of data read into the dataflow. When the filter runs inside the NoSQL database, fewer records are read, transferred, and transformed, which lowers cost, supports data governance by limiting data movement, and continues to scale as the dataset grows. Scaling up the integration runtime (A) may improve performance through added parallelism, but at a higher cost, and it does not reduce the data volume. Developing a custom connector (B) could offer performance benefits, but it is time-consuming and may be unnecessary if an existing connector suffices. Using Azure Cache for Redis (C) can speed up access to frequently queried data, but it does not solve the problem of large initial data reads.
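The difference between filtering at the source and filtering after the read can be illustrated with a small simulation. This is only a sketch under stated assumptions: `FakeDocumentStore`, its `scan` method, the `documents_sent` counter, and the sample `region`/`year` fields are all hypothetical stand-ins, not part of Azure Data Factory or any real NoSQL client. The counter tracks how many documents leave the "database" in each approach.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterator, List, Optional

Document = Dict[str, Any]


@dataclass
class FakeDocumentStore:
    """Hypothetical in-memory stand-in for a NoSQL collection."""

    documents: List[Document]
    documents_sent: int = 0  # documents that left the "database"

    def scan(
        self, predicate: Optional[Callable[[Document], bool]] = None
    ) -> Iterator[Document]:
        """Yield documents; a predicate, if given, is applied inside the store."""
        for doc in self.documents:
            if predicate is None or predicate(doc):
                self.documents_sent += 1
                yield doc


def is_eu_2024(doc: Document) -> bool:
    """Illustrative filter: only EU documents from 2024 are needed downstream."""
    return doc["region"] == "EU" and doc["year"] == 2024


def make_store() -> FakeDocumentStore:
    """Build 1,000 sample documents with made-up region and year values."""
    regions = ["EU", "US", "APAC", "LATAM"]
    docs = [
        {"id": i, "region": regions[i % 4], "year": 2021 + (i // 4) % 4}
        for i in range(1000)
    ]
    return FakeDocumentStore(docs)


def read_then_filter(store: FakeDocumentStore) -> List[Document]:
    """Read everything, then filter in the pipeline (every document is transferred)."""
    return [doc for doc in store.scan() if is_eu_2024(doc)]


def filter_at_source(store: FakeDocumentStore) -> List[Document]:
    """Push the filter into the store (only matching documents are transferred)."""
    return list(store.scan(predicate=is_eu_2024))


if __name__ == "__main__":
    late_store, early_store = make_store(), make_store()
    late_result = read_then_filter(late_store)
    early_result = filter_at_source(early_store)
    print("same result:", late_result == early_result)
    print("documents sent, filter after read:", late_store.documents_sent)
    print("documents sent, filter at source:", early_store.documents_sent)
```

Both functions return the same documents, but the source-side filter transfers only the matching ones. In a real pipeline the equivalent step would typically be a query or filter expression on the source dataset rather than a Python predicate.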
Author: LeetQuiz Editorial Team
As a Microsoft Fabric Analytics Engineer Associate, you are tasked with optimizing a data pipeline in Azure Data Factory that reads from a NoSQL database. The pipeline's performance is degraded due to the large volume of data being processed. Considering cost efficiency, compliance with data governance policies, and the need for scalability, which of the following approaches would BEST improve the performance of the dataflow? (Choose one option)
A
Scale up the Azure Data Factory integration runtime to increase the number of nodes, thereby enhancing parallelism without modifying the data retrieval logic.
B
Develop and implement a custom connector specifically designed to optimize data retrieval from the NoSQL database, assuming no suitable connector exists.
C
Leverage Azure Cache for Redis to temporarily store and quickly access frequently queried data from the NoSQL database, reducing read operations.
D
Apply data filtering at the source within the NoSQL database to reduce the volume of data read into the dataflow, ensuring only necessary data is processed.