
A bank is developing a random forest model for fraud detection, and only 1% of the transactions in its dataset are marked as fraudulent. The bank is particularly concerned about minimizing false negatives so that fraudulent transactions are not missed, while also keeping computational costs reasonable. Which data transformation strategy would best enhance the classifier's effectiveness under these constraints? Choose the best option.
A
Normalize all the numeric features using the Z-score to ensure that all features contribute equally to the model's decisions.
B
Apply a log transformation to all numeric features to reduce the impact of outliers and skewness in the data.
C
Increase the number of fraudulent transactions tenfold by oversampling the minority class, to improve the model's ability to detect fraud.
D
Transform the target variable using the Box-Cox transformation to stabilize variance and make the data more normally distributed.
E
Both A and C are necessary to effectively enhance the model's performance by addressing feature scaling and class imbalance simultaneously.
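
For reference, the oversampling described in option C can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not a definitive implementation: the function name `oversample_minority`, the toy `rows` and `labels`, and the fixed seed are all assumptions introduced here, and the sketch reads "10 times" as random duplication of fraud rows with replacement.

```python
import random

def oversample_minority(rows, labels, minority_label=1, factor=10, seed=0):
    """Return (rows, labels) in which minority examples appear `factor` times as often.

    Hypothetical helper: extra copies are drawn with replacement from the
    existing minority rows (plain random oversampling).
    """
    rng = random.Random(seed)
    minority = [r for r, y in zip(rows, labels) if y == minority_label]
    # factor=10 means 9 extra copies' worth of rows on top of the originals
    extra_count = len(minority) * (factor - 1)
    extra = rng.choices(minority, k=extra_count) if minority else []
    new_rows = list(rows) + extra
    new_labels = list(labels) + [minority_label] * len(extra)
    # Shuffle rows and labels together so duplicates are not clustered at the end
    order = list(range(len(new_rows)))
    rng.shuffle(order)
    return [new_rows[i] for i in order], [new_labels[i] for i in order]

# Toy data (assumed): 1,000 transactions, 1% labeled as fraud
rows = [{"amount": float(i)} for i in range(1000)]
labels = [1 if i < 10 else 0 for i in range(1000)]

X_res, y_res = oversample_minority(rows, labels)
fraud_share = sum(y_res) / len(y_res)
print(len(y_res), sum(y_res), round(fraud_share, 3))
```

On this toy data the fraud count goes from 10 of 1,000 rows to 100 of 1,090 rows, roughly 9.2%. Resampling of this kind is normally applied to the training split only, so that duplicated fraud rows do not leak into the evaluation data.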