
Explanation:
Correct answer: A. Batching the job into ten-second increments is the most effective strategy. If each element triggered its own callout, the pipeline would issue tens of thousands of API calls per second. With calls averaging about 1 second each, that means tens of thousands of requests in flight at any moment, which creates significant backpressure. Grouping elements into batches and making one call per batch keeps the number of callouts manageable. Reference: Guide to Common Cloud Dataflow Use Case Patterns, Part 1.
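The batching idea can be illustrated with a minimal plain-Python sketch. It is not Dataflow or Apache Beam code. The names `window_start`, `batch_by_window`, `assign_guids`, and `fake_guid_service` are hypothetical, and the sketch assumes the GUID service accepts a list of data points and returns one GUID per data point, which the question does not state.

```python
import uuid
from typing import Callable, Dict, Iterable, List, Tuple

WINDOW_SECONDS = 10  # batch size taken from the correct answer

Event = Tuple[float, str]  # (timestamp in seconds, data point); hypothetical shape


def window_start(timestamp: float, size: int = WINDOW_SECONDS) -> int:
    """Return the start of the fixed window that contains the timestamp."""
    return int(timestamp // size) * size


def batch_by_window(events: Iterable[Event],
                    size: int = WINDOW_SECONDS) -> Dict[int, List[str]]:
    """Group data points by the ten-second window they fall into."""
    batches: Dict[int, List[str]] = {}
    for timestamp, data_point in events:
        batches.setdefault(window_start(timestamp, size), []).append(data_point)
    return batches


def assign_guids(events: Iterable[Event],
                 call_service: Callable[[List[str]], List[str]],
                 size: int = WINDOW_SECONDS) -> Dict[str, str]:
    """Make one service callout per window instead of one per element."""
    result: Dict[str, str] = {}
    for _, data_points in sorted(batch_by_window(events, size).items()):
        guids = call_service(data_points)  # a single call covers the whole batch
        result.update(zip(data_points, guids))
    return result


def fake_guid_service(data_points: List[str]) -> List[str]:
    """Hypothetical stand-in for the real GUID service (assumed batch API)."""
    return [str(uuid.uuid4()) for _ in data_points]


if __name__ == "__main__":
    sample = [(0.5, "user-a"), (3.0, "user-b"), (12.0, "user-c")]
    # Two windows (0 and 10), so the service is called twice, not three times.
    print(assign_guids(sample, fake_guid_service))
```

With this shape, the number of callouts depends on the number of windows rather than the number of messages, which is the property the explanation relies on.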
How can you optimally design a pipeline to generate a globally unique identifier (GUID) for new website users, utilizing a service that processes data points and returns a GUID? The pipeline must efficiently manage tens of thousands of messages per second and employ multi-threading to reduce system backpressure.
A. Batch the job into ten-second increments.
B. Use HTTP calls to call the service.
C. Create a static pipeline in the class definition.
D. Create a new object in the startBundle method of DoFn.