
Answer-first summary for fast verification
Answer: (C) Implement L1 regularization (Lasso) to penalize the absolute size of the coefficients, effectively shrinking the coefficients of non-informative features to zero; and (A) employ a stepwise feature selection method, iteratively adding or removing features based on their statistical significance to refine the model.
L1 regularization, or Lasso regression, is particularly suited for scenarios with a large number of input features, many of which may be irrelevant. It introduces a penalty to the loss function proportional to the absolute values of the coefficients, encouraging sparsity by driving some coefficients to zero, thus performing feature selection. This method is efficient and scalable, making it ideal for high-dimensional datasets. While Shapley values and stepwise selection offer insights into feature importance, L1 regularization provides a more automated and scalable approach to feature selection. Combining L1 regularization with Shapley values (option E) offers a comprehensive approach but is more complex and may not be necessary for all scenarios. For more details, refer to the documentation on L1 regularization and feature selection techniques.
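The sparsity-inducing behavior described above can be seen directly in a small sketch. The data below is synthetic and the penalty strength `alpha=0.05` is an arbitrary illustrative choice (in practice it would be tuned, e.g. with `LassoCV`): only the first five of 100 features carry signal, and the L1 penalty drives the remaining coefficients to exactly zero.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_samples, n_features = 500, 100

# Features normalized to [-1, 1], matching the scenario in the question.
X = rng.uniform(-1.0, 1.0, size=(n_samples, n_features))

# Only the first 5 features are informative; the other 95 are noise.
true_coef = np.zeros(n_features)
true_coef[:5] = [2.0, -1.5, 1.0, 0.8, -0.6]
y = X @ true_coef + rng.normal(scale=0.1, size=n_samples)

# The L1 penalty shrinks coefficients of non-informative features to zero,
# so the surviving coefficients identify the selected features.
lasso = Lasso(alpha=0.05)
lasso.fit(X, y)

selected = np.flatnonzero(lasso.coef_)
print(f"kept {selected.size} of {n_features} features: {selected}")
```

Inspecting `lasso.coef_` afterward shows the informative features retained with near-original coefficients while the noise features are exactly zero, which is the feature-selection effect the explanation refers to.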
Author: LeetQuiz Editorial Team
In the context of developing a linear regression model for a dataset comprising over 100 input features, all normalized to the range between -1 and 1, you hypothesize that a significant number of these features may not contribute meaningfully to the predictive performance of the model. Your objective is to identify and eliminate these non-informative features while preserving the original form of the informative ones. Considering the need for efficiency, scalability, and the preservation of interpretability, which of the following methods would be the most appropriate to achieve this goal? Choose two correct options.
A
Employ a stepwise feature selection method, iteratively adding or removing features based on their statistical significance to refine the model.
B
Apply principal component analysis (PCA) to transform the features into a lower-dimensional space, thereby indirectly removing non-informative features.
C
Implement L1 regularization (Lasso) to penalize the absolute size of the coefficients, effectively shrinking the coefficients of non-informative features to zero.
D
Use Shapley values post-model construction to identify and manually remove features that contribute the least to the model's predictions.
E
Combine both L1 regularization for initial feature selection and Shapley values for a secondary validation of feature importance.
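For comparison with option A, a stepwise procedure can be sketched with scikit-learn's `SequentialFeatureSelector`. Note one assumption: scikit-learn does not implement significance-based stepwise selection directly, so this sketch uses cross-validated score improvement as the add/remove criterion instead of p-values; the data is a small synthetic set with three informative features.

```python
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)

# Small toy set: forward stepwise refits the model once per candidate feature
# at every step, so it scales poorly compared with a single Lasso fit.
X = rng.uniform(-1.0, 1.0, size=(300, 20))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + X[:, 2] + rng.normal(scale=0.1, size=300)

# Forward stepwise: greedily add the feature that most improves the CV score.
sfs = SequentialFeatureSelector(
    LinearRegression(), n_features_to_select=3, direction="forward", cv=5
)
sfs.fit(X, y)

selected = np.flatnonzero(sfs.get_support())
print("selected features:", selected)
```

This recovers the informative features on a problem of this size, but the repeated refitting is what makes stepwise selection less scalable than L1 regularization once the feature count grows.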