
You are tasked with building a linear regression model for a financial services company to predict customer loan approval rates. The dataset includes over a hundred input features, all scaled between -1 and 1, but preliminary analysis suggests that many of them contribute little meaningful information. The company emphasizes model interpretability and requires a solution that preserves the informative features while clearly identifying and eliminating the non-informative ones, without significantly increasing computational cost. Given these constraints, which technique should be applied? Choose the best option.
A
Apply principal component analysis (PCA) to transform the features into a lower-dimensional space, discarding the least informative features.
B
Use an iterative dropout method during training to dynamically pinpoint and remove features whose absence does not degrade the model's performance.
C
Leverage L1 regularization (Lasso) during model training to automatically zero out the coefficients of non-informative features, effectively removing them from the model.
D
After constructing the model, employ Shapley value analysis to identify and manually remove the least informative features based on their contribution scores.
E
Combine L1 regularization to eliminate non-informative features and then apply Shapley value analysis to validate the importance of the remaining features.
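To illustrate the L1 regularization approach described in option C, here is a minimal sketch using scikit-learn on synthetic data. The feature count, informative-feature indices, coefficient values, and `alpha` setting are all hypothetical, chosen only to mirror the scenario in the question (many scaled features, few of them informative):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_samples, n_features = 500, 100

# Features scaled to [-1, 1], as in the scenario.
X = rng.uniform(-1, 1, size=(n_samples, n_features))

# Hypothetical ground truth: only the first 5 features are informative.
true_coef = np.zeros(n_features)
true_coef[:5] = [2.0, -1.5, 1.0, 0.5, -0.8]
y = X @ true_coef + rng.normal(0.0, 0.1, size=n_samples)

# L1 penalty drives the coefficients of non-informative
# features to exactly zero during training.
lasso = Lasso(alpha=0.05)
lasso.fit(X, y)

kept = np.flatnonzero(lasso.coef_)
print(f"features kept: {kept.size} of {n_features}")
```

Because the surviving coefficients map directly onto the original features, the fitted model stays interpretable, and the zeroed coefficients explicitly identify which features were discarded; the `alpha` strength controls how aggressively coefficients are pushed to zero and would normally be tuned by cross-validation (e.g. `LassoCV`).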