
Answer-first summary for fast verification
Answer: Add parallel interleave to the pipeline
The question describes a naive synchronous input pipeline in which training data is split across multiple files, and the goal is to reduce input pipeline execution time. Option D (Add parallel interleave to the pipeline) is correct: parallel interleave reads and preprocesses multiple files concurrently, overlapping I/O with computation so the accelerator is not left idle waiting for data. This directly addresses the bottleneck of a synchronous implementation, which opens and reads one file at a time, and it applies exactly when data is distributed across multiple files. The community discussion supports this unanimously (100% consensus on D), citing the TensorFlow documentation's recommendation to parallelize data extraction when data is sharded across files.

The other options are less suitable: A (Increase the CPU load) adds work without removing the serial bottleneck; B (Add caching to the pipeline) speeds up repeated passes over the data but not the first pass through many files; and C (Increase the network bandwidth) is irrelevant, because the problem is pipeline serialization, not network throughput.
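As a concrete sketch of the recommended fix, the snippet below uses TensorFlow's `tf.data.Dataset.interleave` with `num_parallel_calls=tf.data.AUTOTUNE` so several shards are read concurrently. The small in-memory `make_shard` datasets are an assumption standing in for real per-file readers such as `tf.data.TextLineDataset` or `tf.data.TFRecordDataset`:

```python
import tensorflow as tf

# Hypothetical stand-in for one training-data file: in practice this
# would be e.g. tf.data.TFRecordDataset(path) for a real shard.
def make_shard(i):
    return tf.data.Dataset.range(i * 10, i * 10 + 3)

shard_ids = tf.data.Dataset.range(4)

# Parallel interleave: up to cycle_length shards are opened and read
# concurrently, and AUTOTUNE lets tf.data pick the parallelism level,
# overlapping I/O with preprocessing instead of reading files serially.
dataset = shard_ids.interleave(
    make_shard,
    cycle_length=4,
    num_parallel_calls=tf.data.AUTOTUNE,
)

result = list(dataset.as_numpy_iterator())
print(result)
```

With the default `deterministic=True`, the elements come out in a round-robin order across the four shards, so parallelism does not change the observed element set.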
Author: LeetQuiz Editorial Team
You have a naive synchronous implementation for model training and observe low GPU utilization. Your training data is split across multiple files. To reduce input pipeline execution time, what should you do?
A
Increase the CPU load
B
Add caching to the pipeline
C
Increase the network bandwidth
D
Add parallel interleave to the pipeline