
Answer-first summary for fast verification
Answer: E — When the entire data can fit on each core
Increasing parallelism from 4 to 8 cores speeds up tuning only if a full copy of the training data fits in the memory available to each core. Because the data is broadcast, every core trains on its own complete copy, and with total cluster memory fixed, doubling the cores halves the memory each core can use. If a copy still fits, the eight cores can each train one model independently and work through the hyperparameter candidates roughly twice as fast. The shape of the data (options A and B) matters less than whether it fits in memory. If the model itself can't be parallelized (option C), that limits training a single model faster but is irrelevant here, since the parallelism is across models, not within one. Whether the tuning process is randomized (option D) doesn't change how efficiently additional cores are used. The deciding condition is therefore E: the entire dataset fitting into each core's memory.
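The pattern described above, where each worker holds a full copy of the data and trains one model per hyperparameter candidate, can be sketched with a toy example. This is a minimal illustration, not the engineer's actual pipeline: the `evaluate` "training" step, the candidate list, and the worker count are all made up for demonstration.

```python
# Toy sketch of model-level parallel tuning: each worker task carries
# the FULL dataset (like a broadcast), so each worker's memory must
# hold a complete copy -- the condition for more cores to help.
from concurrent.futures import ProcessPoolExecutor


def evaluate(args):
    """Toy 'training': score one hyperparameter candidate on the full data."""
    data, alpha = args
    loss = sum((x - alpha) ** 2 for x in data)
    return alpha, loss


def tune(data, alphas, workers):
    # One independent task per candidate; adding workers shortens the
    # wall-clock time as long as every worker can hold `data` in memory.
    tasks = [(data, a) for a in alphas]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(evaluate, tasks))
    return min(results, key=lambda r: r[1])


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]
    best_alpha, best_loss = tune(data, [0.0, 1.0, 2.5, 5.0], workers=4)
    print(best_alpha, best_loss)
```

Doubling `workers` here (4 to 8) only pays off while each process can still materialize its own copy of `data`; if the copies no longer fit, the workers spill or fail, and extra cores stop helping.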
Author: LeetQuiz Editorial Team
A machine learning engineer is working on scaling an ML pipeline by distributing its single-node model tuning procedure across multiple cores. After broadcasting the entire training data to each core, where each core can train one model at a time, the engineer finds the tuning process still slow. To speed it up, the engineer considers increasing the parallelism from 4 to 8 cores, but the total memory in the cluster cannot be increased. Under which condition would increasing the parallelism from 4 to 8 cores speed up the tuning process? Choose only ONE best answer.
A
When the data has a lengthy shape
B
When the data has a broad shape
C
When the model can't be parallelized
D
When the tuning process is randomized
E
When the entire data can fit on each core