
Explanation:
The most suitable imputation strategy for a categorical variable such as 'happiness_tier' is to use the mode, which is the most frequent value in the dataset. This approach is preferred because:
Thus, imputing missing values in categorical columns with the mode is the most effective strategy.
Ultimate access to all questions.
No comments yet.
When dealing with missing values in a dataset that includes a categorical variable like 'happiness_tier', what is the most appropriate imputation strategy?
A
No imputation is necessary for categorical variables.
B
Impute using the median.
C
Impute using the mean.
D
Impute using the most common value (or mode).