You need to train and apply a different machine learning model to each group within a Spark DataFrame. Which Pandas Function API would you use to achieve this, and why?
A. Scalar Pandas UDF because it applies a single model to the entire DataFrame.
B. Grouped Map Pandas UDF because it allows for the training and application of different models to different groups.
C. Iterator Pandas UDF because it processes data in chunks.
D. Grouped Aggregate Pandas UDF because it aggregates data before model application.
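
To make the grouped map pattern in option B concrete, here is a minimal sketch of per-group training in PySpark 3.x, where this pattern is exposed as `groupBy(...).applyInPandas(...)`. The column names (`group`, `x`, `y`), the helper names `fit_and_predict` and `apply_per_group`, and the use of a simple least-squares line as the "model" are all assumptions made for illustration; they do not come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical output schema for the illustration columns used below.
OUTPUT_SCHEMA = "group string, x double, y double, prediction double"


def fit_and_predict(pdf: pd.DataFrame) -> pd.DataFrame:
    """Train and apply one model on the rows of a single group.

    Spark passes each group to this function as a pandas DataFrame.
    The model here is a least-squares line, standing in for any
    per-group estimator. Each group needs at least two rows.
    """
    slope, intercept = np.polyfit(pdf["x"], pdf["y"], deg=1)
    out = pdf[["group", "x", "y"]].copy()
    out["prediction"] = slope * out["x"] + intercept
    return out


def apply_per_group(spark_df):
    """Run fit_and_predict once per group on a Spark DataFrame.

    Assumes spark_df has columns group (string), x (double), y (double).
    """
    return spark_df.groupBy("group").applyInPandas(
        fit_and_predict, schema=OUTPUT_SCHEMA
    )
```

Because each group arrives as a complete pandas DataFrame and the function may return any number of rows, a separate model can be fit and applied per group. A scalar or iterator Pandas UDF sees batches of rows with no group boundaries, and a grouped aggregate Pandas UDF must reduce each group to a single value.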