
You are tasked with building a linear regression model for a financial services company to predict customer loan approval rates. The dataset includes over a hundred input features, all scaled between -1 and 1, but preliminary analysis suggests that many of them contribute little meaningful information. The company emphasizes model interpretability and requires a solution that preserves the informative features while clearly identifying and eliminating the non-informative ones, without significantly increasing computational cost. Given these constraints, which technique should be applied? Choose the best option.
A
Apply principal component analysis (PCA) to transform the features into a lower-dimensional space, discarding the least informative features.
B
Use an iterative dropout method during training to dynamically pinpoint and remove features whose absence does not degrade the model's performance.
C
Leverage L1 regularization (Lasso) during model training to automatically zero out the coefficients of non-informative features, effectively removing them from the model.
D
After constructing the model, employ Shapley value analysis to identify and manually remove the least informative features based on their contribution scores.
E
Combine L1 regularization to eliminate non-informative features and then apply Shapley value analysis to validate the importance of the remaining features.
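To illustrate the L1 regularization approach described in option C, here is a minimal sketch using scikit-learn on synthetic data. The feature count, informative-feature indices, coefficient values, and `alpha` setting are all hypothetical, chosen only to mirror the scenario in the question (many scaled features, few of them informative):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_samples, n_features = 500, 100

# Features scaled to [-1, 1], as in the scenario.
X = rng.uniform(-1, 1, size=(n_samples, n_features))

# Hypothetical ground truth: only the first 5 features are informative.
true_coef = np.zeros(n_features)
true_coef[:5] = [2.0, -1.5, 1.0, 0.5, -0.8]
y = X @ true_coef + rng.normal(0.0, 0.1, size=n_samples)

# L1 penalty drives the coefficients of non-informative
# features to exactly zero during training.
lasso = Lasso(alpha=0.05)
lasso.fit(X, y)

kept = np.flatnonzero(lasso.coef_)
print(f"features kept: {kept.size} of {n_features}")
```

Because the surviving coefficients map directly onto the original features, the fitted model stays interpretable, and the zeroed coefficients explicitly identify which features were discarded; the `alpha` strength controls how aggressively coefficients are pushed to zero and would normally be tuned by cross-validation (e.g. `LassoCV`).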