Ultimate access to all questions.
Upgrade Now 🚀
Sign in to unlock AI tutor
In a Spark MLlib implementation, you are working with a large dataset and need to build a model for a binary classification task. Which of the following algorithms would be most suitable for this task, and why?
A
Linear regression, as it can be used for both regression and classification tasks.
B
Decision trees, as they can handle non-linear relationships and complex interactions between features.
C
Logistic regression, as it is specifically designed for binary classification tasks and provides a probabilistic interpretation of the predictions.
D
Random forests, as they are an ensemble method that combines multiple decision trees to improve accuracy and robustness.