
Answer-first summary for fast verification
Answer: The model is loaded just once per executor, not per batch, enhancing inference efficiency.
The primary advantage of using an Iterator in this code block is that the model only needs to be loaded once per executor, rather than once per batch during the inference process. This approach significantly reduces computational overhead associated with model loading. Spark DataFrames are inherently distributed across multiple executors for parallel processing, and the use of an Iterator does not hinder this distribution. Instead, it optimizes the inference process by minimizing redundant model loads across batches, thereby leveraging Spark's distributed processing capabilities more efficiently. Other options either misinterpret the role of Iterators in data distribution or underestimate their benefits in reducing model loading overhead.
Author: LeetQuiz Editorial Team
Ultimate access to all questions.
A machine learning engineer is scaling the inference of a single-node model on a Spark DataFrame with one million records using a specific code block. What is the primary advantage of using an Iterator in this context?
A
The model will be confined to a single executor, limiting data distribution.
B
Data will be confined to a single executor, reducing the need for multiple model loads.
C
Data will be spread across multiple executors for inference processing.
D
Including an Iterator offers no advantages in this scenario.
E
The model is loaded just once per executor, not per batch, enhancing inference efficiency.
No comments yet.