
You are a Machine Learning Engineer at a tech company that recently deployed a model to predict loan approvals. Three months after deployment, an audit reveals that the model performs significantly worse for applicants from certain demographic subgroups, raising concerns about biased outcomes. The investigation suggests that the training data was imbalanced, with some groups underrepresented. Because of privacy regulations, collecting additional data is not an option. The company is now looking for strategies to mitigate this bias without violating compliance constraints. Which two strategies would you recommend to best address this issue? (Choose two.)
A
Document the model's limitations and provide a detailed explanation of its behavior to the stakeholders, without making any changes to the model.
B
Remove data points from overrepresented groups to balance the dataset and retrain the model, despite the reduction in overall dataset size.
C
Implement a cost-sensitive learning approach by adjusting the loss function to impose a higher penalty for misclassifications in the minority groups, and retrain the model.
D
Apply the Synthetic Minority Over-sampling Technique (SMOTE) to the existing dataset to increase the representation of minority groups, and retrain the model.
E
Identify and remove features that are highly correlated with the majority group to reduce bias, and retrain the model.
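
For readers unfamiliar with the mechanics behind options C and D, the following is a minimal Python sketch of both ideas, using only the standard library. It illustrates the techniques themselves and is not a statement about which options are correct. The function names, the inverse-frequency weighting scheme, and the toy data are all assumptions made for illustration; a real pipeline would more likely rely on an established library.

```python
import math
import random
from collections import Counter


def group_weights(groups):
    """Weight each group inversely to its frequency (one common, assumed scheme)."""
    counts = Counter(groups)
    n, k = len(groups), len(counts)
    return {g: n / (k * c) for g, c in counts.items()}


def weighted_log_loss(y_true, y_prob, groups, weights, eps=1e-12):
    """Binary cross-entropy where each example's loss is scaled by its group weight."""
    total, norm = 0.0, 0.0
    for y, p, g in zip(y_true, y_prob, groups):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
        total += weights[g] * loss
        norm += weights[g]
    return total / norm


def smote_like(minority, n_new, k=2, seed=0):
    """SMOTE-style interpolation: each synthetic point lies on the segment between
    a minority sample and one of its k nearest minority neighbours.
    Requires at least two minority points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(x, p),
        )[:k]
        nn = rng.choice(neighbours)
        u = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + u * (b - a) for a, b in zip(x, nn)])
    return synthetic


# Toy data (hypothetical): group "b" is underrepresented.
groups = ["a", "a", "a", "b"]
y_true = [1, 0, 1, 1]
y_prob = [0.9, 0.1, 0.9, 0.5]

weights = group_weights(groups)
uniform = {g: 1.0 for g in weights}
weighted = weighted_log_loss(y_true, y_prob, groups, weights)
unweighted = weighted_log_loss(y_true, y_prob, groups, uniform)

minority_points = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
new_points = smote_like(minority_points, n_new=4)
```

In this toy setup the poorly predicted example belongs to the underrepresented group, so the weighted loss comes out higher than the unweighted one, which is the signal a cost-sensitive retraining step would act on. The interpolated points always stay inside the range spanned by the existing minority samples, so no new real-world data is collected.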