Microsoft Certified Azure Data Scientist Associate - DP-100

Explanation:

The question states that applying predictors for each column is not required. That rules out MICE (Multiple Imputation by Chained Equations), which imputes each column with missing values by fitting a predictive model (typically a regression) on the other columns. Replace using Probabilistic PCA (PPCA) fits the constraint: instead of building a predictor per column, it estimates a low-dimensional linear approximation of the whole dataset and reconstructs the missing entries from it, which suits a small dataset with missing values in many columns. Community discussion is mixed, but the highest-upvoted comments (3 upvotes) support PPCA, and the more detailed arguments reach the same answer on the basis of the no-predictors constraint. Other options such as Normalization and SMOTE are not imputation methods: normalization rescales values, and SMOTE oversamples the minority class.
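To make the contrast with per-column predictors concrete, here is a minimal Python sketch of PCA-based imputation. It is a simplified iterative low-rank reconstruction, not probabilistic PCA proper and not the implementation used by the Clean Missing Data module. The function name `pca_impute` and its parameters are hypothetical, and the sketch assumes numeric data in which every column has at least one observed value.

```python
import numpy as np


def pca_impute(X, n_components=1, n_iter=50, tol=1e-6):
    """Fill NaNs using a low-rank (PCA) reconstruction of the whole matrix.

    Hypothetical helper for illustration only: one shared low-dimensional
    model is fitted to the full dataset, rather than one predictor per column.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)

    # Start by filling each missing cell with its column's observed mean.
    col_means = np.nanmean(X, axis=0)
    filled = np.where(missing, col_means, X)

    for _ in range(n_iter):
        # Fit PCA to the current completed matrix via SVD of the centered data.
        mu = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)

        # Reconstruct the data from the leading components only.
        k = n_components
        low_rank = (U[:, :k] * s[:k]) @ Vt[:k] + mu

        # Overwrite only the missing cells; observed values stay untouched.
        updated = np.where(missing, low_rank, X)
        change = np.max(np.abs(updated - filled))
        filled = updated
        if change < tol:
            break

    return filled


if __name__ == "__main__":
    # Toy data: the second column is roughly twice the first, one value missing.
    data = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, np.nan]])
    print(pca_impute(data, n_components=1))
```

A MICE-style approach would instead loop over the columns and fit a separate regression for each one that contains missing values, which is the per-column predictor step the question says is not required.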

You are creating a new experiment in Azure Machine Learning Studio. You have a small dataset with missing values in many columns. Applying predictors for each column is not required. You plan to use the Clean Missing Data module.

Which data cleaning method should you select?

Answer: Replace using Probabilistic PCA
