
Answer-first summary for fast verification
Answer: Insert a Reshuffle operation after each processing step, and monitor the execution details in the Dataflow console.

**Correct Answer: B.** Inserting a Reshuffle operation after each processing step breaks Dataflow's fusion optimization, so each step appears as a separate stage in the execution graph. Monitoring the execution details in the Dataflow console then shows which step is consuming the most time, pinpointing the bottleneck.

**Why the other options are incorrect:**
- **A**: Logging debug information in each ParDo helps with correctness debugging, but it does not reveal where processing time is spent once the steps have been fused into a single stage.
- **C**: Verifying service-account permissions addresses write failures, not processing-speed bottlenecks; it does nothing to identify a slow step.
- **D**: Inserting output sinks after each step only measures write throughput and adds cost and latency; it does not isolate which processing step is slow.
Author: LeetQuiz Editorial Team
As a maintainer of ETL pipelines, you notice that a Dataflow streaming pipeline is lagging, and Dataflow's automatic optimization has fused the pipeline graph into a single step. How can you identify the bottleneck in this scenario?
**A.** Log debug information in each ParDo function, and analyze the logs at execution time.

**B.** Insert a Reshuffle operation after each processing step, and monitor the execution details in the Dataflow console.

**C.** Verify that the Dataflow service accounts have appropriate permissions to write the processed data to the output sinks.

**D.** Insert output sinks after each key processing step, and observe the writing throughput of each block.