
Answer-first summary for fast verification
Answer: Median imputation can introduce bias if the missingness is related to the target variable.
Median imputation fills every gap in a numerical feature with the median of the observed values, which implicitly assumes that the missing values resemble the observed ones. When the values are not missing at random, for example when missingness is related to the target variable, that assumption fails: the imputed values are systematically off, the feature's variance shrinks, and its relationship with the target is distorted. A model trained on such data can generalize poorly to new data. Alternatives include more sophisticated methods such as multiple imputation, or a machine learning model that predicts the missing values from the other features.
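The effect can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a definitive recipe: the data are synthetic, the names `z`, `x_true`, `x_median`, and `x_model` are hypothetical, and the "model-based" alternative is a plain one-variable linear regression rather than full multiple imputation. Large values of `x` are made far more likely to be missing, so the observed values are not representative. Because the missingness depends on the unobserved value itself, the regression is expected to reduce the bias, not remove it.

```python
import random
import statistics

random.seed(0)
n = 20_000

# Synthetic data (assumption): z is fully observed, x is the feature with gaps.
z = [random.gauss(0, 1) for _ in range(n)]
x_true = [2 * zi + random.gauss(0, 0.5) for zi in z]

# Not missing at random: large values of x are much more likely to be missing.
missing = [random.random() < (0.8 if xi > 1 else 0.05) for xi in x_true]

obs_z = [zi for zi, m in zip(z, missing) if not m]
obs_x = [xi for xi, m in zip(x_true, missing) if not m]

# Strategy 1: fill every gap with the median of the observed values.
median = statistics.median(obs_x)
x_median = [median if m else xi for xi, m in zip(x_true, missing)]

# Strategy 2: fit x ~ z by ordinary least squares on the observed rows,
# then predict the missing values from z.
mean_z = statistics.fmean(obs_z)
mean_x = statistics.fmean(obs_x)
sxy = sum((zi - mean_z) * (xi - mean_x) for zi, xi in zip(obs_z, obs_x))
szz = sum((zi - mean_z) ** 2 for zi in obs_z)
slope = sxy / szz
intercept = mean_x - slope * mean_z
x_model = [intercept + slope * zi if m else xi
           for xi, zi, m in zip(x_true, z, missing)]

# Compare the mean of each imputed column with the mean of the complete data.
true_mean = statistics.fmean(x_true)
bias_median = statistics.fmean(x_median) - true_mean
bias_model = statistics.fmean(x_model) - true_mean

print(f"bias of mean after median imputation:     {bias_median:+.3f}")
print(f"bias of mean after regression imputation: {bias_model:+.3f}")
```

In this setup the median-imputed column underestimates the true mean, because the gaps that held large values are filled with a value from the lower, over-represented part of the distribution. Using the correlated feature `z` recovers much of that lost information.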
Author: LeetQuiz Editorial Team
Discuss the potential drawbacks of using median imputation for a numerical feature in a dataset where the missing values are not missing at random. Explain how this could affect the model's ability to generalize to new data and suggest an alternative approach to handle such missing data.
A. Median imputation can introduce bias if the missingness is related to the target variable.
B. Median imputation is always unbiased and appropriate for all types of missing data.
C. Median imputation should not be used for numerical features.
D. Median imputation is only suitable for categorical features.