
Explanation:
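The correct answer is A. It imports SimpleImputer from sklearn.impute, creates it with strategy='median', fits it on df[['Age']], and assigns the result of imputer.transform(df[['Age']]) back to df['Age']. During fit, the imputer computes the median of the non-missing 'Age' values; during transform, it replaces every missing value with that median. The median is less sensitive to outliers than the mean, so a few extreme ages do not skew the fill value, which makes it a common choice for skewed features. Option B fills with the mean, option D fills with the constant 0, and option C passes strategy='mode', which SimpleImputer does not accept (the most-frequent-value strategy is named 'most_frequent').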
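As a minimal sketch of the behavior described above, the snippet below runs median imputation on a small made-up 'Age' column. The data values and the extra column name 'Age_median' are assumptions chosen for illustration; they are not part of the question. The result is written to a separate column here only so the original values remain available for comparison with the mean.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical toy data: two missing ages and one outlier (95).
df = pd.DataFrame({"Age": [22.0, 25.0, np.nan, 30.0, 95.0, np.nan]})

# fit learns the median of the observed values;
# transform replaces each NaN with that learned median.
imputer = SimpleImputer(strategy="median")
df["Age_median"] = imputer.fit_transform(df[["Age"]]).ravel()

# The statistic learned for the single column that was fitted.
learned_median = imputer.statistics_[0]

# For comparison: the mean of the observed values, which the outlier pulls upward.
mean_fill = df["Age"].mean()

print(df)
print("median fill:", learned_median)
print("mean fill:  ", mean_fill)
```

With these values the observed ages are 22, 25, 30, and 95, so the median fill is 27.5 while the mean would be 43.0, which shows how a single outlier shifts the mean but not the median.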
Consider a dataset with a numerical feature 'Age' having missing values. Write a code snippet to perform median imputation on this feature using Python and the scikit-learn library. Explain how this process handles missing values.
A
from sklearn.impute import SimpleImputer
imputer = SimpleImputer(strategy='median')
imputer.fit(df[['Age']])
df['Age'] = imputer.transform(df[['Age']])
B
from sklearn.impute import SimpleImputer
imputer = SimpleImputer(strategy='mean')
imputer.fit(df[['Age']])
df['Age'] = imputer.transform(df[['Age']])
C
from sklearn.impute import SimpleImputer
imputer = SimpleImputer(strategy='mode')
imputer.fit(df[['Age']])
df['Age'] = imputer.transform(df[['Age']])
D
from sklearn.impute import SimpleImputer
imputer = SimpleImputer(strategy='constant', fill_value=0)
imputer.fit(df[['Age']])
df['Age'] = imputer.transform(df[['Age']])