
In a dataset with a numerical feature 'Income', you have observed that some values are missing. You are considering different strategies to handle these missing values. Compare and contrast imputing missing values with the mean value versus the median value, and explain the scenarios where each approach would be more appropriate.
A
Imputing missing values with the mean is always better than using the median, as it provides a more accurate representation of the central tendency.
B
Imputing missing values with the median is always better than using the mean, as it is more robust to outliers and skewed distributions.
C
Imputing missing values with the mean is more appropriate when the distribution of 'Income' is symmetric and has no extreme outliers, while using the median is more suitable when the distribution is skewed or has extreme outliers.
D
The choice between imputing with the mean or median does not matter, as both approaches will lead to the same results in the model.
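
To make the contrast between the two strategies concrete, here is a minimal sketch using only the Python standard library. The income figures and the `impute` helper are hypothetical and chosen purely for illustration; they are not part of the question. The sketch fills a single missing value in a roughly symmetric sample and in the same sample with one extreme earner added.

```python
from statistics import mean, median


def impute(values, strategy="median"):
    """Replace None entries with the mean or median of the observed values.

    Hypothetical helper for illustration only.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return [fill if v is None else v for v in values]


# Made-up incomes: roughly symmetric, no extreme values.
symmetric = [48_000, 50_000, None, 52_000, 54_000, 46_000]

# The same incomes plus one extreme earner, which skews the distribution.
skewed = symmetric + [2_000_000]

for name, data in [("symmetric", symmetric), ("skewed", skewed)]:
    mean_fill = impute(data, "mean")[2]      # index 2 is the missing entry
    median_fill = impute(data, "median")[2]
    print(f"{name}: mean fill = {mean_fill}, median fill = {median_fill}")
```

In the symmetric sample both strategies fill the gap with 50,000, so the choice makes little difference. With the single outlier added, the mean fill jumps to 375,000, a value no typical person in the sample earns, while the median fill stays at 51,000.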