Databricks Certified Machine Learning - Associate

Explanation:

The Spark ML algorithm that can handle a mix of feature types effectively is Decision Trees. Here's why:

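Linear Regression: This algorithm accepts only numerical inputs, so categorical features must first be encoded (typically one-hot encoded).
Gradient Boosting: Spark ML's gradient-boosted trees are ensembles of decision trees and accept indexed categorical features in the same way. One-hot encoding is not required, and applying it anyway expands the feature space and can degrade performance.
Random Forest: Like Gradient Boosting, it is an ensemble of decision trees and inherits their handling of categorical features, with the same drawbacks if one-hot encoding is applied.
Decision Trees: This algorithm natively processes both numerical and categorical features without one-hot encoding. Categorical columns only need to be index-encoded (for example with StringIndexer or VectorIndexer) so that Spark treats them as categorical. It chooses each split by maximizing information gain, regardless of feature type. Decision Trees are therefore the expected answer for projects combining numerical and categorical features, and they are the building block that gives the tree ensembles above the same ability.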
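
As a rough illustration of the information-gain criterion mentioned above, the following plain-Python sketch scores one numerical split and one categorical split with the same function. It is not Spark code: the toy data and the helper names entropy and information_gain are hypothetical, and Spark ML's own implementation (which bins continuous features, among other things) is more involved.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(labels, goes_left):
    """Entropy reduction of a binary split; goes_left[i] routes row i to the left child."""
    left = [y for y, flag in zip(labels, goes_left) if flag]
    right = [y for y, flag in zip(labels, goes_left) if not flag]
    if not left or not right:
        return 0.0  # a split that sends every row to one side gains nothing
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted


# Hypothetical toy data: one numerical feature, one categorical feature, one label.
ages = [22, 25, 47, 52, 46, 56]
plans = ["basic", "basic", "premium", "premium", "basic", "premium"]
churned = [1, 1, 0, 0, 1, 0]

# Numerical feature: the split is a threshold comparison.
gain_age = information_gain(churned, [a <= 30 for a in ages])

# Categorical feature: the split is membership in a subset of categories.
gain_plan = information_gain(churned, [p in {"basic"} for p in plans])

print(f"gain(age <= 30)       = {gain_age:.3f}")
print(f"gain(plan in {{basic}}) = {gain_plan:.3f}")
```

Both splits reduce to a yes/no test per row, which is why the same scoring applies to either feature type. On this toy data the categorical split separates the labels perfectly, so it scores higher than the age threshold.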

Real Exam

Linear Regression: 9.7%
Random Forest: 19.4%
Gradient Boosting: 22.6%
Decision Trees: 48.4%