
While developing a machine learning model to forecast stock market trends, you find that certain features span a significantly wider range of values than others. This disparity in feature scales could lead the model to overfit by disproportionately weighting the high-magnitude features. Given the need for a solution that is both effective and efficient, and considering constraints such as computational cost and the preservation of feature information, which of the following strategies would be the MOST appropriate to implement? (Choose one correct option)
A. Apply a logarithmic transformation to adjust the scale of features, which can be particularly useful for features with exponential distributions.
B. Employ principal component analysis (PCA) to reduce the dimensionality of the data, thereby indirectly addressing the scale disparity among features.
C. Normalize the data by scaling all features to a 0–1 range, ensuring each feature contributes equally to the model's learning process without significantly altering the original data distribution.
D. Implement a binning system that replaces each feature's value with a corresponding bin number, which simplifies the data but may lead to a loss of granularity.
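For reference, option C describes min-max normalization. The sketch below shows how that rescaling works, assuming scikit-learn's MinMaxScaler is available; the feature names and values are illustrative only and are not part of the question.

```python
# Minimal sketch of min-max normalization (the technique described in option C).
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical features on very different scales:
# column 0 = daily trading volume, column 1 = price-to-earnings ratio.
X = np.array([
    [1_200_000.0, 12.5],
    [3_500_000.0, 18.2],
    [  850_000.0,  9.7],
    [5_100_000.0, 24.1],
])

# Rescale each feature to the [0, 1] range: x' = (x - min) / (max - min).
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)

print(X_scaled)
# Each column now lies in [0, 1], so no feature dominates purely by magnitude,
# while the relative ordering of values within each column is preserved.
```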