
Answer-first summary for fast verification
Answer: Use a distributed version of the decision tree algorithm, where each node in the cluster builds a separate decision tree.
In a Spark MLlib project, a distributed version of the decision tree algorithm is the most effective of the listed techniques for improving the scalability and performance of the model on a large dataset. Distributing the training lets the cluster process partitions of the data in parallel; in practice, Spark MLlib parallelizes tree construction by having each worker compute split statistics over its own partition, which the driver then aggregates to grow the tree. The other techniques can improve the quality of a single decision tree: increasing the maximum depth captures more complex patterns (at the cost of more computation), while raising the minimum number of instances required to split a node, feature selection, and pruning all reduce overfitting and model complexity. None of them, however, addresses scalability on a large dataset the way parallelizing the computation across the cluster does.
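To make the distributed idea concrete, here is a minimal sketch in plain Python (no Spark dependency, all names hypothetical): each "partition" independently computes sufficient statistics (class counts on either side of a candidate split), and a driver-side merge combines them to evaluate the split. This mirrors the pattern Spark uses, where workers compute per-partition statistics and the driver aggregates them.

```python
# Sketch of distributed split evaluation for a decision tree.
# Assumption: binary split on one numeric feature, Gini impurity criterion.
from collections import Counter

def partition_stats(rows, feature_idx, threshold):
    """Class counts left/right of a candidate split, computed on one partition."""
    left, right = Counter(), Counter()
    for features, label in rows:
        (left if features[feature_idx] <= threshold else right)[label] += 1
    return left, right

def merge_stats(stats_list):
    """Driver-side merge of the per-partition statistics."""
    left, right = Counter(), Counter()
    for l, r in stats_list:
        left += l
        right += r
    return left, right

def gini(counts):
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def split_impurity(left, right):
    """Weighted Gini impurity of the two child nodes."""
    n = sum(left.values()) + sum(right.values())
    return (sum(left.values()) * gini(left) + sum(right.values()) * gini(right)) / n

# Two "partitions" of a toy dataset: ([feature0], label)
p1 = [([1.0], 0), ([2.0], 0), ([8.0], 1)]
p2 = [([1.5], 0), ([9.0], 1), ([7.5], 1)]

# Each partition computes its stats independently (in Spark: mapPartitions),
# then the driver aggregates them (in Spark: reduce/treeAggregate).
stats = [partition_stats(p, 0, 5.0) for p in (p1, p2)]
left, right = merge_stats(stats)
print(split_impurity(left, right))  # 0.0 -> threshold 5.0 separates the classes perfectly
```

Because only compact statistics (not raw rows) travel to the driver, this pattern scales to datasets far larger than any single machine's memory.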
Author: LeetQuiz Editorial Team
In a Spark MLlib project, you are working with a large dataset and need to build a decision tree model. Which of the following techniques can be used to improve the scalability and performance of the decision tree algorithm in Spark?
A. Use a single decision tree model and increase the maximum depth of the tree to capture more complex patterns.
B. Use a distributed version of the decision tree algorithm, where each node in the cluster builds a separate decision tree.
C. Increase the minimum number of instances required to split a node in the decision tree to reduce overfitting.
D. Use a combination of feature selection techniques and pruning methods to simplify the decision tree and reduce its complexity.