
Answer-first summary for fast verification
Answer: Rewrite your input function using parallel reads, parallel processing, and prefetch.
The correct answer is D: Rewrite your input function using parallel reads, parallel processing, and prefetch. The trace shows that compute time is low relative to the HostToDevice and DeviceToHost times, which means the TPU is sitting idle waiting on data transfer from the host (CPU) rather than being limited by computation. Optimizing the input pipeline with parallel reads, parallel processing, and prefetch overlaps host-side I/O and preprocessing with device computation, removing the transfer bottleneck. It is also the cost-efficient choice: upgrading to TPU v3 (A) or switching to eight V100 GPUs (B) adds hardware cost without fixing the input bottleneck, and resizing or reshaping images (C) does not address the transfer stalls.
Author: LeetQuiz Editorial Team
You are training an object detection model using a Cloud TPU v2. However, you have observed that the training time is taking longer than expected. You obtained a simplified trace through the Cloud TPU profile and noticed that the compute time is relatively low compared to the HostToDevice and DeviceToHost time, suggesting a potential data transfer bottleneck. Based on this information, what action should you take to decrease the training time in a cost-efficient way?
A
Move from Cloud TPU v2 to Cloud TPU v3 and increase batch size.
B
Move from Cloud TPU v2 to 8 NVIDIA V100 GPUs and increase batch size.
C
Rewrite your input function to resize and reshape the input images.
D
Rewrite your input function using parallel reads, parallel processing, and prefetch.
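The technique named in option D can be sketched as a `tf.data` pipeline. This is a minimal illustration, not the question's actual training code: the `parse_example` decode function and the TFRecord feature layout are hypothetical placeholders, and a real object-detection pipeline would decode and augment image bytes in the `map` step.

```python
import tensorflow as tf

def parse_example(record):
    # Hypothetical per-record decode step; a real pipeline would parse
    # and augment image bytes here.
    return tf.io.parse_single_example(
        record, {"image": tf.io.FixedLenFeature([], tf.string)}
    )

def build_dataset(file_paths, batch_size=64):
    ds = tf.data.Dataset.from_tensor_slices(file_paths)
    # Parallel reads: interleave several file readers concurrently
    # instead of reading shards one at a time.
    ds = ds.interleave(
        tf.data.TFRecordDataset,
        num_parallel_calls=tf.data.AUTOTUNE,
        deterministic=False,
    )
    # Parallel processing: decode records across multiple CPU threads.
    ds = ds.map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
    ds = ds.batch(batch_size, drop_remainder=True)
    # Prefetch: overlap host-side preprocessing with device step execution,
    # so the accelerator is not starved waiting for the next batch.
    return ds.prefetch(tf.data.AUTOTUNE)
```

Letting `tf.data.AUTOTUNE` choose the parallelism and prefetch depth is the usual default; the same overlap principle applies whether the device is a TPU or a GPU.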