
Answer-first summary for fast verification
Answer: Batch the job into ten-second increments.
The correct answer is D. In a pipeline that processes tens of thousands of messages per second, calling the external GUID service once per message sends it more simultaneous requests than it can absorb, and the unprocessed work backs up into the pipeline. Batching the job into ten-second increments throttles the rate at which requests reach the service: each batch is sent as one request, so the load is spread evenly over time instead of arriving as a flood of individual calls. Options A, B, and C do not address backpressure on their own, because none of them reduces the number of requests sent to the service.
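The batching idea can be illustrated with a minimal sketch in plain Python. It is not Dataflow or Apache Beam code, and the names `batch_by_window`, `request_guids`, and `fake_fetch_guids` are hypothetical; the stand-in function replaces the real GUID service, whose API the question does not describe.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

WINDOW_SECONDS = 10  # ten-second increments, as in option D


def batch_by_window(
    messages: Iterable[Tuple[float, str]],
    window_seconds: int = WINDOW_SECONDS,
) -> Dict[int, List[str]]:
    """Group (timestamp, user_id) pairs into fixed windows keyed by window start."""
    batches: Dict[int, List[str]] = defaultdict(list)
    for timestamp, user_id in messages:
        window_start = int(timestamp // window_seconds) * window_seconds
        batches[window_start].append(user_id)
    return dict(batches)


def request_guids(
    batches: Dict[int, List[str]],
    fetch_guids: Callable[[List[str]], List[str]],
) -> Tuple[Dict[str, str], int]:
    """Make one service call per batch instead of one call per message."""
    results: Dict[str, str] = {}
    calls = 0
    for window_start in sorted(batches):
        user_ids = batches[window_start]
        guids = fetch_guids(user_ids)  # a single request covers the whole batch
        calls += 1
        results.update(zip(user_ids, guids))
    return results, calls


def fake_fetch_guids(user_ids: List[str]) -> List[str]:
    # Assumption: stand-in for the external HTTP GUID service.
    return [f"guid-{user_id}" for user_id in user_ids]


# Five messages spread over three ten-second windows.
messages = [(0.5, "u1"), (3.2, "u2"), (9.9, "u3"), (10.0, "u4"), (25.1, "u5")]
batches = batch_by_window(messages)
guid_by_user, call_count = request_guids(batches, fake_fetch_guids)
```

Here five messages result in three service calls rather than five. At tens of thousands of messages per second the reduction is far larger, since each ten-second window collapses into a single request.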
Author: LeetQuiz Editorial Team
As a Google Professional Data Engineer, you are tasked with providing each new website user with a globally unique identifier (GUID). The GUIDs are generated by a service that processes various data points sourced from both internal and external systems. Microservices within your pipeline retrieve this data through HTTP calls. The volume is high, at tens of thousands of messages per second, and the messages can be handled in a multi-threaded manner, so you are concerned about system backpressure. What design approach should you take for your pipeline to effectively minimize backpressure?
A. Call out to the service via HTTP.
B. Create the pipeline statically in the class definition.
C. Create a new object in the startBundle method of DoFn.
D. Batch the job into ten-second increments.