
Answer-first summary for fast verification
Answer: Grouped Map Pandas UDF because it allows for the training and application of different models to different groups.
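The Grouped Map Pandas UDF fits this requirement because it follows a split-apply-combine pattern: Spark splits the DataFrame by the grouping key, passes each group to a Python function as a pandas DataFrame, and combines the returned DataFrames into the result. Because the function sees all rows of one group at once, it can train a model on that group and return that group's predictions. Scalar and Iterator Pandas UDFs operate on arbitrary batches of rows rather than whole groups, and a Grouped Aggregate Pandas UDF reduces each group to a single value, so none of them can return per-row predictions from a group-specific model. In Spark 3.0 and later, the grouped map pattern is exposed as groupBy(...).applyInPandas(...).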
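As a rough illustration of the grouped map pattern, the sketch below fits a separate straight line per group and returns per-row predictions. The column names (group, x, y), the helper names fit_and_predict and score_by_group, and the use of numpy.polyfit as a stand-in for a real model are assumptions made for this example, not part of the question. The Spark call assumes Spark 3.0 or later, where grouped map is available through applyInPandas.

```python
import numpy as np
import pandas as pd

# Output schema passed to applyInPandas as a DDL string.
# Column names are illustrative assumptions.
RESULT_SCHEMA = "group string, x double, y double, prediction double"


def fit_and_predict(pdf: pd.DataFrame) -> pd.DataFrame:
    """Train on one group's rows and return those rows with predictions.

    np.polyfit (a least-squares line) stands in for any per-group model.
    """
    slope, intercept = np.polyfit(pdf["x"].to_numpy(), pdf["y"].to_numpy(), deg=1)
    out = pdf[["group", "x", "y"]].copy()
    out["prediction"] = slope * out["x"] + intercept
    return out


def score_by_group(sdf):
    """Run fit_and_predict once per group of a Spark DataFrame (Spark 3.0+)."""
    return sdf.groupBy("group").applyInPandas(fit_and_predict, schema=RESULT_SCHEMA)
```

Because fit_and_predict is a plain pandas function, it can be checked locally on a single group before it is handed to Spark. Each group must fit in the memory of one executor, since Spark materializes the whole group as one pandas DataFrame.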
Author: LeetQuiz Editorial Team
You need to train and apply different machine learning models to different groups within a Spark DataFrame. Which Pandas Function API would you use to achieve this, and why?
A. Scalar Pandas UDF because it applies a single model to the entire DataFrame.
B. Grouped Map Pandas UDF because it allows for the training and application of different models to different groups.
C. Iterator Pandas UDF because it processes data in chunks.
D. Grouped Aggregate Pandas UDF because it aggregates data before model application.