
An AI practitioner is building a model to generate images of humans in various professions. The AI practitioner discovered that the input data is biased and that specific attributes affect the image generation and create bias in the model. Which technique will solve the problem?
Explanation:
Correct Answer: A. Data augmentation for imbalanced classes
Why this is correct:
Addressing data bias: The problem states that the input data is biased, meaning certain classes or attributes are underrepresented or overrepresented in the training dataset. Data augmentation addresses this directly by generating additional synthetic examples to balance the training data.
For imbalanced classes: When data is biased, certain professions or attributes have fewer examples than others. Data augmentation techniques (such as flips, rotations, and color adjustments) can create additional synthetic examples for underrepresented classes to balance the dataset.
Direct solution to the stated problem: The question explicitly mentions that "specific attributes affect the image generation and create bias in the model." Data augmentation for imbalanced classes directly tackles this root cause by balancing the training data.
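To make the idea concrete, here is a minimal sketch of data augmentation for imbalanced classes. It uses NumPy arrays as stand-in images and simple transforms (flip, rotation, brightness jitter); the class names and dataset sizes are hypothetical, and a real pipeline would use a library such as torchvision or Albumentations instead.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image, rng):
    """Create one synthetic variant of an image via simple transforms."""
    out = image
    if rng.random() < 0.5:
        out = np.fliplr(out)                        # horizontal flip
    out = np.rot90(out, k=rng.integers(0, 4))       # random 90-degree rotation
    out = np.clip(out * rng.uniform(0.8, 1.2), 0.0, 1.0)  # brightness jitter
    return out

def balance_classes(images_by_class, rng):
    """Oversample underrepresented classes with augmented copies
    until every class matches the largest class size."""
    target = max(len(imgs) for imgs in images_by_class.values())
    balanced = {}
    for label, imgs in images_by_class.items():
        extra = [augment(imgs[rng.integers(len(imgs))], rng)
                 for _ in range(target - len(imgs))]
        balanced[label] = imgs + extra
    return balanced

# Toy imbalanced dataset: "engineer" has 10 examples, "nurse" only 2.
data = {
    "engineer": [rng.random((32, 32, 3)) for _ in range(10)],
    "nurse":    [rng.random((32, 32, 3)) for _ in range(2)],
}
balanced = balance_classes(data, rng)
print({k: len(v) for k, v in balanced.items()})  # both classes now size 10
```

After balancing, the model sees each class equally often during training, which is exactly the mechanism by which augmentation mitigates the bias described in the question.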
Why other options are incorrect:
B. Model monitoring for class distribution: This helps detect bias but doesn't solve it. Monitoring identifies the problem but doesn't fix the underlying data imbalance.
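To illustrate why monitoring alone is insufficient, here is a hypothetical sketch of class-distribution monitoring: it reports which classes fall below a fairness threshold but leaves the dataset untouched. The labels, counts, and `tolerance` parameter are illustrative assumptions, not part of the question.

```python
from collections import Counter

def monitor_distribution(labels, tolerance=0.5):
    """Flag classes whose share falls below tolerance * (1 / num_classes).
    Note: this only *detects* imbalance; it does not change the data."""
    counts = Counter(labels)
    total = sum(counts.values())
    fair_share = 1 / len(counts)
    return {label: count / total
            for label, count in counts.items()
            if count / total < tolerance * fair_share}

# Hypothetical label distribution across professions.
labels = ["engineer"] * 70 + ["teacher"] * 25 + ["nurse"] * 5
print(monitor_distribution(labels))  # {'nurse': 0.05}
```

The function surfaces the problem ("nurse" is underrepresented) but a separate intervention, such as the augmentation shown for option A, is still needed to fix it.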
C. Retrieval-Augmented Generation (RAG): This is a technique for enhancing language models with external knowledge retrieval, not for addressing bias in image generation models.
D. Watermark detection for images: This is unrelated to bias mitigation and deals with identifying watermarks in images for copyright or authenticity purposes.
Key takeaway: When dealing with biased training data in machine learning, especially for generative models, addressing the data imbalance through techniques like data augmentation is a fundamental approach to reducing model bias.