Databricks Certified Machine Learning - Associate

Get started today

Ultimate access to all questions.

In a Spark MLlib project, you are working with a large dataset and need to build a model for a multi-class classification task. Which of the following algorithms would be most suitable for this task, and why?

Simulated

Linear regression, as it can be extended to handle multi-class classification tasks by using techniques such as one-vs-rest.

4.0%

Decision trees, as they can handle non-linear relationships and complex interactions between features, and can be extended to handle multi-class classification tasks.

Comments

Loading comments...

Logistic regression, as it is specifically designed for binary classification tasks and cannot be extended to handle multi-class classification tasks.

6.7%

Random forests, as they are an ensemble method that combines multiple decision trees to improve accuracy and robustness, and can be extended to handle multi-class classification tasks.

35.6%