Google Professional Data Engineer


When running Dataflow jobs, you see this error in the logs: "A hot key HOT_KEY_NAME was detected in…". You need to resolve this issue and make the workload performant.




Explanation:

A. Incorrect. It is recommended to enable Dataflow Shuffle because it partitions and groups data by key in a scalable, efficient, fault-tolerant manner.
B. Incorrect. Increasing the amount of data with the same hot key will increase hotspots, making the data processing less efficient.
C. Correct. The Dataflow transformations are more performant with an evenly distributed key.
D. Incorrect. Adding more compute instances does not automatically resolve hot spots in the data.
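To illustrate why an evenly distributed key helps, here is a minimal sketch of key salting with a two-stage aggregation, written in plain Python rather than against the Dataflow or Apache Beam APIs. The function name `salted_sum`, the `num_salts` parameter, and the sample records are all hypothetical and chosen only for illustration. The idea is that a hot key is first split into several (key, salt) sub-keys so that no single group holds all of its records, and the partial results are then merged per original key.

```python
import random
from collections import defaultdict


def salted_sum(records, num_salts=4, seed=0):
    """Sum values per key in two stages, spreading each key over `num_salts` sub-keys.

    Hypothetical helper for illustration; not part of any Dataflow or Beam API.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible

    # Stage 1: aggregate by (key, salt). In a distributed runner, each
    # (key, salt) group could be handled by a different worker.
    partial = defaultdict(int)
    for key, value in records:
        salt = rng.randrange(num_salts)
        partial[(key, salt)] += value

    # Stage 2: drop the salt and combine the much smaller partial results.
    totals = defaultdict(int)
    for (key, _salt), subtotal in partial.items():
        totals[key] += subtotal

    return dict(totals), dict(partial)


# Assumed sample data: one heavily skewed key and one small key.
records = [("hot", 1)] * 1000 + [("cold", 1)] * 10
totals, partial = salted_sum(records)

# Size of the largest single group after salting.
largest_group = max(partial.values())
print(totals["hot"], totals["cold"], largest_group)
```

The final totals match what a single-stage aggregation would produce, but the largest group processed in stage 1 is only a fraction of the hot key's records. This only works for aggregations that can be merged from partial results (sums, counts, maxima, and similar). In an actual pipeline the same effect would be achieved with the pipeline SDK's own mechanisms rather than hand-written loops.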
