
Answer-first summary for fast verification
Answer: Add an additional class to categorical feature A for missing values. Create a new binary feature that indicates whether feature A is missing.
The correct answer is D. When a categorical feature with significant predictive power has missing values, adding a dedicated class for the missing values and a binary indicator feature is a robust approach. It keeps every row and preserves the predictive power of feature A while explicitly recording whether the value was present. The indicator lets the model learn when to use feature A and when to rely on other features, which can improve predictive performance. The alternatives are weaker. Option A either discards a predictive feature or leaves the missing values unhandled. Option B, imputing with the mode, inflates the most frequent class and hides the fact that the value was missing. Option C, substituting values from a correlated feature, may introduce bias and loses the information carried by the missingness itself.
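As a rough illustration of option D, the following is a minimal pandas sketch, not a definitive implementation. The helper name `add_missing_category_and_flag`, the column name `feature_A`, the `"Missing"` label, and the sample data are all hypothetical choices made for this example.

```python
import pandas as pd


def add_missing_category_and_flag(df, column, missing_label="Missing"):
    """Return a copy of df with a missing-value flag and an explicit missing class.

    Hypothetical helper for illustration; the names are not from the question.
    """
    out = df.copy()
    # Binary indicator: 1 where the original value was missing, 0 otherwise.
    out[f"{column}_is_missing"] = out[column].isna().astype(int)
    # Extra class: replace missing values with a dedicated label.
    out[column] = out[column].astype("object").fillna(missing_label)
    return out


# Made-up sample data with two missing values.
df = pd.DataFrame({"feature_A": ["red", None, "blue", "red", None]})
result = add_missing_category_and_flag(df, "feature_A")
print(result)
```

The flag is computed before the fill so that it reflects the original missingness. Note that if feature A is later one-hot encoded, the column for the "Missing" class carries the same information as the flag, so the separate indicator is likely to matter more with other encodings.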
Author: LeetQuiz Editorial Team
During an exploratory data analysis on a dataset intended for a predictive modeling task, you find that a categorical feature, referred to as feature A, shows significant predictive capability. However, there are instances where values for feature A are missing. What approach should you take to handle the missing values in feature A without losing its predictive power?
A. Drop feature A if more than 15% of values are missing. Otherwise, use feature A as-is.
B. Compute the mode of feature A and then use it to replace the missing values in feature A.
C. Replace the missing values with the values of the feature with the highest Pearson correlation with feature A.
D. Add an additional class to categorical feature A for missing values. Create a new binary feature that indicates whether feature A is missing.