Ultimate access to all questions.
Upgrade Now 🚀
Sign in to unlock AI tutor
In a Spark MLlib project, you are working with a large dataset and need to build a model for a multi-class classification task. Which of the following algorithms would be most suitable for this task, and why?
A
Linear regression, as it can be extended to handle multi-class classification tasks by using techniques such as one-vs-rest.
B
Decision trees, as they can handle non-linear relationships and complex interactions between features, and can be extended to handle multi-class classification tasks.
C
Logistic regression, as it is specifically designed for binary classification tasks and cannot be extended to handle multi-class classification tasks.
D
Random forests, as they are an ensemble method that combines multiple decision trees to improve accuracy and robustness, and can be extended to handle multi-class classification tasks.