You work for a bank and are building a random forest model for fraud detection. Your dataset of transactions is heavily imbalanced: only 1% of the transactions are labeled as fraudulent. Given this imbalance, which data transformation strategy is most likely to improve the performance of your classifier?
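
For context, one transformation commonly discussed for this kind of imbalance is random oversampling of the minority class. The sketch below is only an illustration under stated assumptions: the field names `amount` and `is_fraud`, the helper `oversample_minority`, the replication factor of 10, and the toy data are all hypothetical and do not come from the question.

```python
import random


def oversample_minority(rows, label_key="is_fraud", factor=10, seed=0):
    """Return a shuffled copy of `rows` in which the minority (fraud) class
    is grown to `factor` times its original size by sampling with replacement.

    `label_key` and `factor` are illustrative assumptions, not values
    taken from the question.
    """
    rng = random.Random(seed)
    minority = [r for r in rows if r[label_key] == 1]
    majority = [r for r in rows if r[label_key] == 0]
    # Draw extra fraud rows with replacement until the class is `factor` times larger.
    extra = [rng.choice(minority) for _ in range(len(minority) * (factor - 1))]
    resampled = majority + minority + extra
    rng.shuffle(resampled)
    return resampled


# Hypothetical toy data: 1,000 transactions, 1% of them labeled as fraud.
transactions = [
    {"amount": float(i), "is_fraud": 1 if i % 100 == 0 else 0}
    for i in range(1000)
]

balanced = oversample_minority(transactions, factor=10)
fraud_share = sum(r["is_fraud"] for r in balanced) / len(balanced)
print(f"{len(balanced)} rows, fraud share {fraud_share:.3f}")
```

Resampling of this kind is normally applied to the training split only, so that the evaluation data keeps the original 1% fraud rate.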