
Answer-first summary for fast verification
Answer: Imputing missing values with the mean is more appropriate when the distribution of 'Income' is symmetric and has no extreme outliers, while using the median is more suitable when the distribution is skewed or has extreme outliers.
Option C is correct. Whether the mean or the median is the better fill value depends on the distribution of the 'Income' feature. When the distribution is roughly symmetric and has no extreme outliers, the mean and median are close to each other, and the mean is a reasonable estimate of the central value. When the distribution is skewed or contains extreme outliers, the mean is pulled toward the tail, so the median is more appropriate because it is much less sensitive to extreme values. Options A and B are wrong because neither statistic is always better, and option D is wrong because the two fill values can differ substantially on skewed data.
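The difference can be seen in a small sketch. The income figures below are made up for illustration, and `impute` is a hypothetical helper written with only the Python standard library, not a function from any particular data-science package.

```python
from statistics import mean, median


def impute(values, strategy="median"):
    """Replace None entries with the mean or median of the observed values.

    Hypothetical helper for illustration only.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return [fill if v is None else v for v in values]


# Made-up incomes, roughly symmetric: mean and median agree.
symmetric = [48_000, 50_000, None, 52_000, 50_000]

# Made-up incomes with one extreme outlier: the mean is pulled upward.
skewed = [30_000, 35_000, None, 40_000, 45_000, 1_000_000]

for name, data in [("symmetric", symmetric), ("skewed", skewed)]:
    mean_fill = impute(data, "mean")[2]
    median_fill = impute(data, "median")[2]
    print(f"{name}: mean fill = {mean_fill}, median fill = {median_fill}")
```

On the symmetric sample both strategies fill the gap with 50,000. On the skewed sample the mean fill is 230,000, larger than every observed income except the outlier, while the median fill is 40,000.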
Author: LeetQuiz Editorial Team
In a dataset with a numerical feature 'Income', you have observed that some values are missing. You are considering different strategies to handle these missing values. Compare and contrast imputing missing values with the mean value versus the median value, and explain the scenarios where each approach would be more appropriate.
A
Imputing missing values with the mean is always better than using the median, as it provides a more accurate representation of the central tendency.
B
Imputing missing values with the median is always better than using the mean, as it is more robust to outliers and skewed distributions.
C
Imputing missing values with the mean is more appropriate when the distribution of 'Income' is symmetric and has no extreme outliers, while using the median is more suitable when the distribution is skewed or has extreme outliers.
D
The choice between imputing with the mean or median does not matter, as both approaches will lead to the same results in the model.