
You are conducting exploratory data analysis on a dataset and encounter an important categorical feature with 5% missing values. To preserve the integrity of your analysis and minimize bias from the missing values, which of the following approaches is the best way to handle them?
A
Remove the rows with missing values, and upsample your dataset by 5%.
B
Replace the missing values with the feature’s mean.
C
Replace the missing values with a placeholder category indicating a missing value.
D
Move the rows with missing values to your validation dataset.
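
To make the placeholder-category approach described in option C concrete, here is a minimal sketch in plain Python. The column contents, the `"Missing"` label, and the helper name `fill_missing_category` are all hypothetical and chosen only for illustration. The sketch assumes missing entries are represented as `None`.

```python
from collections import Counter

# Hypothetical label for the placeholder category (an assumption, not a standard).
MISSING_LABEL = "Missing"


def fill_missing_category(values, placeholder=MISSING_LABEL):
    """Return a copy of `values` with every None replaced by `placeholder`."""
    return [placeholder if v is None else v for v in values]


# Hypothetical categorical column: 20 rows, 1 of them missing (5%).
colors = ["red"] * 8 + ["blue"] * 7 + ["green"] * 4 + [None]

filled = fill_missing_category(colors)
counts = Counter(filled)

# Every row is kept, and the missing entries form their own category.
print(len(filled))
print(counts[MISSING_LABEL])
```

Under these assumptions, no rows are dropped and the original category frequencies are untouched. Note also that a categorical feature has no arithmetic mean, so mean imputation as written in option B is not defined for this kind of column.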