
You work for a bank and are building a random forest model for fraud detection. Your dataset of transactions is heavily imbalanced: only 1% of the transactions are labeled as fraudulent. Given this imbalance, which data transformation strategy would most likely improve the performance of your classifier?
A
Modify the target variable using the Box-Cox transformation.
B
Z-normalize all the numeric features.
C
Oversample the fraudulent transactions 10 times.
D
Log transform all numeric features.
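
To make the mechanics of option C concrete, here is a minimal sketch of what oversampling a minority class 10 times could look like. It uses only the Python standard library. The function name `oversample_minority`, its parameters, and the toy data are hypothetical and chosen for illustration; they are not taken from the question.

```python
import random


def oversample_minority(features, labels, minority_label=1, factor=10, seed=0):
    """Return new lists in which each minority-class row appears `factor` times.

    Hypothetical helper for illustration; majority-class rows are kept once.
    """
    out_x, out_y = [], []
    for x, y in zip(features, labels):
        repeats = factor if y == minority_label else 1
        out_x.extend([x] * repeats)
        out_y.extend([y] * repeats)

    # Shuffle so the duplicated rows are not grouped together.
    order = list(range(len(out_y)))
    random.Random(seed).shuffle(order)
    return [out_x[i] for i in order], [out_y[i] for i in order]


# Toy data (assumed): 1,000 transactions, 1% labeled fraudulent (label 1).
labels = [1] * 10 + [0] * 990
features = [[float(i)] for i in range(1000)]

X_res, y_res = oversample_minority(features, labels)
fraud_share = sum(y_res) / len(y_res)  # 100 fraud rows out of 1,090 total
```

On this toy data the fraudulent share rises from 1% to roughly 9% (100 of 1,090 rows). In practice, resampling like this is normally applied to the training split only, so that evaluation data keeps the original class distribution.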