LeetQuiz Logo
Privacy Policy•contact@leetquiz.com
© 2025 LeetQuiz All rights reserved.
Databricks Certified Machine Learning - Associate

Databricks Certified Machine Learning - Associate

Get started today

Ultimate access to all questions.


A data scientist has developed a linear regression model using log(price) as the target variable. After making predictions, both the predicted and actual values are stored in a Spark DataFrame named preds_df. The data scientist evaluates the model using the following code: regression_evaluator.setMetricName("rmse").evaluate(preds_df). What modification is necessary to the RMSE evaluation approach to ensure it's comparable with the original price scale? Choose the single best answer.

Real Exam




Explanation:

The correct answer is C. Since the model predicts log(price), the predictions are on a log scale. To compare these predictions directly with the original price values, we must first convert the predictions back to the original scale using the exponentiation function (e.g., np.exp in Python). Only then should we calculate the RMSE to assess the model's accuracy in terms of actual prices. This approach ensures that the RMSE reflects the model's performance in the context of the original price scale, making it meaningful for practical interpretation.

  • Option A is incorrect because it suggests calculating the MSE of log-transformed predictions, which doesn't address the need to revert predictions to the original scale.
  • Option B is incorrect as applying the logarithm to already log-transformed predictions doesn't convert them back to the original scale.
  • Option D and Option E are incorrect because they involve manipulating the RMSE value itself rather than the predictions, which doesn't achieve the desired comparability with the original price scale.
Powered ByGPT-5