Explanation
This is a regression problem because:
- Continuous Target Variable: The goal is to predict "next quarter's return", which is a continuous numerical value (e.g., 5.2% or -3.1%)
- Ranking Purpose: After predicting returns for all 100 stocks, we identify the 20 with the lowest estimated returns for replacement. Ranking requires a predicted numerical value for each stock, not just a category label
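The ranking step can be sketched in a few lines of Python. This is only an illustration: the ticker names and the randomly generated predicted returns are made up, standing in for the output of whatever regression model is actually used.

```python
import random

random.seed(0)  # reproducible illustration

# Hypothetical predicted next-quarter returns (in percent) for 100 stocks.
# In practice these numbers would come from a fitted regression model.
predicted = {f"STOCK_{i:03d}": random.gauss(2.0, 5.0) for i in range(100)}

# Rank tickers from lowest to highest predicted return.
ranked = sorted(predicted, key=predicted.get)

# The 20 lowest-ranked stocks are the replacement candidates.
to_replace = ranked[:20]
to_keep = ranked[20:]

print(to_replace[:5])
```

Sorting works here only because each prediction is a number; a label such as "positive" or "negative" would not distinguish the 20th-worst stock from the 21st.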
Why not the other options:
- Classification (B): Would be used if we were predicting discrete categories (e.g., "buy/hold/sell" or "positive/negative return")
- K-means (C): An unsupervised clustering technique for grouping similar data points; it uses no target variable, so it cannot predict continuous values
- PCA (D): A dimensionality reduction technique, not a prediction method
Machine Learning Context:
- Regression algorithms (linear regression, random forest regression, gradient boosting regression, etc.) are designed to predict continuous outcomes
- The features (fundamental and technical variables) serve as inputs to predict the continuous target variable (stock returns)
- Ranking the stocks and selecting those with the lowest predicted returns is a natural application of regression predictions
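As a minimal sketch of that workflow, the code below fits a linear regression by gradient descent on synthetic data and predicts a continuous return for each of 100 stocks. Everything here is an assumption made for illustration: the three features, the coefficients in `TRUE_WEIGHTS`, the noise level, and the training size are invented, and a real application would more likely use a library model (for example a random forest or gradient boosting regressor) on actual fundamental and technical variables.

```python
import random

random.seed(42)  # reproducible illustration

N_STOCKS, N_FEATURES = 100, 3
TRUE_WEIGHTS = [1.5, -2.0, 0.5]  # made-up coefficients, used only to simulate data


def simulate(n):
    """Return (features, returns) for n synthetic stock-quarters."""
    X = [[random.gauss(0, 1) for _ in range(N_FEATURES)] for _ in range(n)]
    y = [sum(w * x for w, x in zip(TRUE_WEIGHTS, row)) + random.gauss(0, 0.5)
         for row in X]
    return X, y


def predict(weights, bias, row):
    """Linear model: a continuous predicted return for one stock."""
    return bias + sum(w * x for w, x in zip(weights, row))


def fit_linear_regression(X, y, lr=0.05, epochs=500):
    """Fit weights and bias by batch gradient descent on mean squared error."""
    n, k = len(X), len(X[0])
    weights, bias = [0.0] * k, 0.0
    for _ in range(epochs):
        errors = [predict(weights, bias, row) - t for row, t in zip(X, y)]
        grad_w = [2 / n * sum(e * row[j] for e, row in zip(errors, X))
                  for j in range(k)]
        grad_b = 2 / n * sum(errors)
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


# Train on historical (synthetic) observations.
X_train, y_train = simulate(400)
weights, bias = fit_linear_regression(X_train, y_train)

# Predict next quarter's return for the current 100 stocks.
X_now, _ = simulate(N_STOCKS)
predicted_returns = [predict(weights, bias, row) for row in X_now]

# Indices of the 20 stocks with the lowest predicted returns.
worst_20 = sorted(range(N_STOCKS), key=lambda i: predicted_returns[i])[:20]
print(worst_20)
```

The model's output is a real number per stock, which is what makes this a regression task; the selection of 20 stocks is a post-processing step applied to those numbers.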