
In a Spark MLlib project, you are tasked with comparing the performance of linear regression and decision tree models on a large dataset. Which of the following evaluation metrics would be most appropriate for this comparison, and why?
A. Accuracy, as it measures the proportion of correct predictions made by the model.
B. Mean squared error (MSE), as it provides a measure of the average squared difference between the predicted and actual values.
C. Area under the receiver operating characteristic (ROC) curve, as it measures the model's ability to distinguish between different classes.
D. R-squared, as it measures the proportion of the variance in the target variable that is explained by the model.
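
For reference, the definitions quoted in options B and D can be sketched in a few lines of plain Python. This is a minimal illustration, not Spark code: the labels and predictions are made-up values, and the helper names `mean_squared_error` and `r_squared` are hypothetical. In PySpark, `pyspark.ml.evaluation.RegressionEvaluator` reports the same quantities when `metricName` is set to `"mse"` or `"r2"`.

```python
def mean_squared_error(actual, predicted):
    """Average squared difference between predicted and actual values."""
    n = len(actual)
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n


def r_squared(actual, predicted):
    """Proportion of the variance in the target explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)           # total sum of squares
    return 1.0 - ss_res / ss_tot


# Hypothetical held-out labels and predictions from two regression models.
actual = [3.0, 5.0, 7.0, 9.0]
linear_preds = [2.5, 5.5, 7.0, 9.0]
tree_preds = [3.0, 6.0, 6.0, 10.0]

for name, preds in [("linear regression", linear_preds),
                    ("decision tree", tree_preds)]:
    print(name, mean_squared_error(actual, preds), r_squared(actual, preds))
```

Both functions take the same label and prediction lists, so two models scored on the same held-out data can be compared directly: a lower MSE or a higher R-squared indicates a closer fit.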