
Answer-first summary for fast verification
Answer: Replace using Probabilistic PCA
The question states that applying predictors for each column is not required. This rules out MICE (Multiple Imputation by Chained Equations), which imputes each column with a predictive model built from the other columns. Probabilistic PCA (PPCA) instead fits a single low-dimensional model that approximates the covariance of the full dataset and fills in missing values from that model, so it needs no per-column predictors and suits datasets with missing values in many columns. Community opinion is mixed, but the comments supporting A (PPCA) have more upvotes (for example, 3 upvotes for Hisayuki's reasoning) and are consistent with the no-predictors constraint. Comments favoring D (MICE) contradict that constraint. Normalization (B) and SMOTE (C) do not handle missing data: normalization rescales feature values, and SMOTE oversamples a minority class.
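To make the "no per-column predictors" point concrete, here is a minimal NumPy sketch of PCA-based imputation. It is a simplified, EM-style iterative PCA used as a stand-in for probabilistic PCA, not the implementation behind the Clean Missing Data module; the function name `pca_impute` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def pca_impute(X, n_components=1, n_iter=50):
    """Fill NaNs by alternating a low-rank PCA fit with re-imputation.

    Hypothetical helper: a simplified stand-in for probabilistic PCA.
    One model is fit to the whole matrix, so no per-column predictor
    is trained (unlike MICE).
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Start from column means computed over the observed entries only.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(n_iter):
        mean = filled.mean(axis=0)
        # Truncated SVD of the centered data gives the principal subspace.
        U, s, Vt = np.linalg.svd(filled - mean, full_matrices=False)
        low_rank = (U[:, :n_components] * s[:n_components]) @ Vt[:n_components] + mean
        # Only missing cells are overwritten; observed values stay untouched.
        filled[missing] = low_rank[missing]
    return filled


# Small illustrative dataset with missing values in several columns.
X_demo = np.array([
    [1.0, np.nan, 3.0],
    [2.0, 4.0, 6.0],
    [3.0, 6.0, 9.0],
    [4.0, 8.0, np.nan],
    [5.0, 10.0, 15.0],
    [np.nan, 12.0, 18.0],
])
X_filled = pca_impute(X_demo, n_components=1)
```

Every missing cell is filled from the same shared low-rank model of the data. MICE, by contrast, would fit a separate regression for each column that contains missing values, which is the step the question says is not required.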
Author: LeetQuiz Editorial Team
You are creating a new experiment in Azure Machine Learning Studio. You have a small dataset with missing values in many columns. Applying predictors for each column is not required. You plan to use the Clean Missing Data module.
Which data cleaning method should you select?
A. Replace using Probabilistic PCA
B. Normalization
C. Synthetic Minority Oversampling Technique (SMOTE)
D. Replace using MICE