Google Professional Machine Learning Engineer


Explanation:

The correct answer is C: 'Oversample the fraudulent transaction 10 times.' Only 1% of the transactions are fraudulent, so a classifier trained on the raw data can reach 99% accuracy by predicting "not fraud" for every transaction and has little incentive to learn the fraud patterns. Repeating each fraudulent transaction 10 times raises the fraud share of the training data from 1% to roughly 9% (10 of every 109 rows), which reduces the bias toward the majority class.

The other options do not address the imbalance. A random forest splits on feature thresholds, so monotonic rescalings such as z-normalization or a log transform leave the splits essentially unchanged, and a Box-Cox transformation is meant for continuous, positive variables rather than a binary fraud label.

Oversample only the training split and evaluate on an untouched validation set that keeps the original class ratio, because duplicated rows make it easier to overfit to the few minority examples. Undersampling the majority class or using the Synthetic Minority Oversampling Technique (SMOTE) are alternatives worth exploring, but plain oversampling is the most straightforward approach in this scenario.
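As a rough illustration of what "oversample 10 times" means in practice, the sketch below repeats every fraud row 10 times in a toy training split. The helper name `oversample_minority`, the toy data, and the choice of plain duplication (rather than random resampling or SMOTE) are assumptions made for this example, not part of the question.

```python
import random


def oversample_minority(rows, labels, minority_label=1, factor=10, seed=0):
    """Repeat every minority-class row `factor` times, then shuffle.

    Hypothetical helper for illustration; apply it to the training
    split only, never to validation or test data.
    """
    pairs = []
    for row, label in zip(rows, labels):
        copies = factor if label == minority_label else 1
        pairs.extend([(row, label)] * copies)
    # Shuffle so the duplicated rows are not clustered together.
    random.Random(seed).shuffle(pairs)
    new_rows = [row for row, _ in pairs]
    new_labels = [label for _, label in pairs]
    return new_rows, new_labels


# Toy training split: 1,000 transactions, 1% labeled as fraud (label 1).
labels = [1] * 10 + [0] * 990
rows = [[float(i)] for i in range(1000)]

X_res, y_res = oversample_minority(rows, labels)
fraud_share = sum(y_res) / len(y_res)
print(len(y_res), round(fraud_share, 3))  # 1090 0.092
```

The resampled lists would then be passed to the classifier's training step, while validation metrics are computed on data that keeps the original 1% fraud rate.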


You work for a bank and are building a random forest model for fraud detection. The dataset you have includes transactions, and it is heavily imbalanced with only 1% of the transactions identified as fraudulent. Given this imbalance, which data transformation strategy would likely improve the performance of your classifier?


A. Modify the target variable using the Box-Cox transformation.

4.9%

B. Z-normalize all the numeric features.

8.5%

C. Oversample the fraudulent transaction 10 times.

79.3%

D. Log transform all numeric features.

7.3%