
Answer-first summary for fast verification
Answer: Ensure that your data is evenly distributed.
A. Incorrect. Dataflow Shuffle should stay enabled, because it partitions and groups data by key in a scalable, efficient, fault-tolerant manner.
B. Incorrect. Adding more data under the same hot key makes the hot spot worse and the processing less efficient.
C. Correct. Dataflow transformations perform better when the data is evenly distributed across keys.
D. Incorrect. Adding more compute instances does not automatically resolve hot spots, because all records for one key are still handled together.
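The reasoning behind option C can be illustrated with key salting, one common way to even out a skewed key distribution. The sketch below is not taken from the question or from the Dataflow documentation. It is plain Python with no Beam dependency, and `NUM_SHARDS`, `salt_key`, and `two_stage_sum` are hypothetical names chosen for illustration. It assumes the aggregation is associative (a sum here), so partial results can be combined in a second stage.

```python
import random
from collections import defaultdict

NUM_SHARDS = 8  # hypothetical fan-out factor; tune to the observed skew


def salt_key(key, num_shards, rng):
    """Attach a random shard number so one logical key becomes several sub-keys."""
    return (key, rng.randrange(num_shards))


def two_stage_sum(records, num_shards=NUM_SHARDS, seed=0):
    """Sum values per key in two stages so no single sub-key holds all the data."""
    rng = random.Random(seed)

    # Stage 1: partial sums per (key, shard). In a distributed runner,
    # each salted sub-key could be processed by a different worker.
    partial = defaultdict(int)
    for key, value in records:
        partial[salt_key(key, num_shards, rng)] += value

    # Stage 2: drop the salt and combine the much smaller partial results.
    totals = defaultdict(int)
    for (key, _shard), subtotal in partial.items():
        totals[key] += subtotal

    return dict(totals), dict(partial)


# A skewed input: one "hot" key dominates the data set.
records = [("hot", 1)] * 8000 + [("cold", 1)] * 10
totals, partial = two_stage_sum(records)

# Largest amount of work attached to any single sub-key of the hot key.
largest_hot_shard = max(v for (k, _s), v in partial.items() if k == "hot")
```

The final totals match a plain per-key sum, but the largest group handled in the first stage is a fraction of the original hot key. In an Apache Beam pipeline, a comparable effect is available for combiners through `CombinePerKey(...).with_hot_key_fanout(...)` in the Python SDK; whether that or manual salting fits depends on the transform that triggered the warning.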
Author: LeetQuiz Editorial Team
When running Dataflow jobs, you see this error in the logs: "A hot key HOT_KEY_NAME was detected in…". You need to resolve this issue and make the workload performant.
A. Disable Dataflow Shuffle.
B. Increase the data with the hot key.
C. Ensure that your data is evenly distributed.
D. Add more compute instances for processing.