
Answer-first summary for fast verification
Answer: Implement pagination to retrieve data in smaller, manageable chunks, adhering to the API's rate limits and reducing memory usage.
Implementing pagination (B) is the best approach: it retrieves data in smaller, manageable chunks, which reduces memory usage and keeps the pipeline within the API's rate limits. Increasing the batch size (A) may not be feasible given API limitations and could worsen, rather than relieve, memory pressure. A distributed cache (C) helps avoid repeated calls for the same data but does not address the initial large retrieval. Filtering at the API level (D) is effective when supported, but not all APIs offer server-side filtering, which makes pagination the more universally applicable solution.
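The pagination pattern described above can be sketched as follows. This is a minimal illustration, not a specific API's implementation: the `fetch_page` function, its `offset`/`limit` parameters, and the in-memory dataset are all illustrative assumptions standing in for a real REST endpoint.

```python
import time

# Illustrative stand-ins for a real paginated REST API.
RECORDS = list(range(23))          # pretend this lives on the server
PAGE_SIZE = 10                     # server's maximum page size

def fetch_page(offset, limit=PAGE_SIZE):
    """Simulate GET /items?offset=...&limit=... returning one page."""
    return RECORDS[offset:offset + limit]

def paginate(delay=0.0):
    """Yield all records page by page.

    Each request asks for at most PAGE_SIZE items, and an optional
    pause between calls keeps the client under the API's rate limit.
    Only one page is held in memory at a time.
    """
    offset = 0
    while True:
        page = fetch_page(offset)
        if not page:               # empty page signals the end of the data
            break
        yield from page
        offset += len(page)
        time.sleep(delay)          # throttle between calls

all_items = list(paginate())
print(len(all_items))              # 23 items retrieved across 3 small requests
```

In Azure Data Factory specifically, the same idea is usually configured declaratively via the REST connector's pagination rules rather than hand-written code, but the control flow is the same: request one bounded page, process it, then request the next.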
Author: LeetQuiz Editorial Team
As a Microsoft Fabric Analytics Engineer Associate, you are tasked with optimizing a data pipeline in Azure Data Factory that reads data from a REST API. The pipeline is experiencing performance issues due to the large volume of data being processed. You need to ensure the solution is cost-effective, scalable, and complies with API rate limits. Which of the following approaches would BEST improve the performance of the dataflow? (Choose one)
A. Increase the batch size of the read operation to its maximum allowable limit to reduce the number of API calls.
B. Implement pagination to retrieve data in smaller, manageable chunks, adhering to the API's rate limits and reducing memory usage.
C. Deploy a distributed cache to temporarily store the data retrieved from the REST API, minimizing repeated API calls for the same data.
D. Apply data filtering at the REST API level before reading it into the dataflow, assuming the API supports such operations.