
Ultimate access to all questions.
A Data Scientist is utilizing a feature store and intends to replace missing values in one of the feature tables with the median value of each respective feature. A colleague argues that this approach discards valuable information. What alternative method can the Data Scientist employ to maximize the inclusion of information in the feature set? Choose the SINGLE best answer.
A
Refrain from imputing the missing values, allowing the machine learning algorithm to decide how to handle them.
B
Generate a binary feature variable for each feature with missing values, indicating whether a row's value was imputed.
C
Eliminate all feature variables that initially contained missing values from the feature set.
D
Impute the missing values using the mean value of each respective feature variable instead of the median.
E
Create a constant feature variable for each feature with missing values, showing the percentage of rows originally missing from the feature.