AWS Certified AI Practitioner

Explanation:

Analysis of the Problem

The scenario describes a classic case of overfitting: the model performs well on training data but poorly on production (unseen) data. This indicates the model has learned patterns specific to the training dataset rather than generalizable patterns that apply to real-world data.
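The symptom described above can be reproduced with a small sketch. It assumes synthetic data (price = 3 × size plus noise) and a 1-nearest-neighbor predictor chosen only because it memorizes its training set; neither comes from the original scenario, and the helper names are hypothetical.

```python
import random

def make_prices(n, rng):
    """Hypothetical synthetic data: price = 3 * size + Gaussian noise."""
    data = []
    for _ in range(n):
        size = rng.uniform(0.0, 10.0)
        price = 3.0 * size + rng.gauss(0.0, 2.0)
        data.append((size, price))
    return data

def predict_1nn(train, size):
    """Return the price of the closest training example (pure memorization)."""
    nearest = min(train, key=lambda pair: abs(pair[0] - size))
    return nearest[1]

def mean_abs_error(train, data):
    """Mean absolute error of the memorizing predictor on `data`."""
    return sum(abs(predict_1nn(train, x) - y) for x, y in data) / len(data)

rng = random.Random(0)
train = make_prices(30, rng)      # data the model has seen
holdout = make_prices(200, rng)   # stand-in for unseen production data

train_mae = mean_abs_error(train, train)
holdout_mae = mean_abs_error(train, holdout)
print(f"training MAE: {train_mae:.2f}")
print(f"held-out MAE: {holdout_mae:.2f}")
```

The training error is zero because every training point is its own nearest neighbor, while the held-out error is not. A gap of this kind between training and held-out error is the usual signal of overfitting.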

Evaluation of Options

A: Reduce the volume of data that is used in training.

Not optimal: Reducing training data would likely worsen overfitting. With fewer examples to learn from, the model becomes more prone to memorizing noise or patterns specific to the limited dataset.
The standard remedy for overfitting is more (or more varied) training data, not less.

B: Add hyperparameters to the model.

Misleading: While hyperparameter tuning (adjusting existing hyperparameters such as learning rate, regularization strength, or dropout rate) is a valid technique to combat overfitting, the phrasing "add hyperparameters" is imprecise. Models already have hyperparameters; the key is to optimize them through tuning, not to add new ones arbitrarily.
This option could be misinterpreted, and it doesn't directly address the core issue of insufficient data diversity.
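To illustrate what tuning an existing hyperparameter means, here is a minimal sketch using L2 (ridge) regularization on a one-feature linear model without an intercept. The data values and the function name `ridge_slope` are made up for illustration. The regularization strength is already part of the model; tuning changes its value rather than adding anything new.

```python
def ridge_slope(xs, ys, l2_strength):
    """Closed-form slope for 1-D ridge regression without an intercept:
    w = sum(x * y) / (sum(x * x) + l2_strength)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + l2_strength)

# Made-up data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# Sweep the existing hyperparameter; larger values shrink the slope toward zero.
for l2_strength in (0.0, 1.0, 10.0):
    slope = ridge_slope(xs, ys, l2_strength)
    print(f"l2_strength={l2_strength:>4}: slope={slope:.3f}")
```

In practice the value would be chosen by comparing validation error across the candidates, not by inspecting the slope.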

C: Increase the volume of data that is used in training.

Optimal solution: Increasing training data volume helps the model encounter more diverse examples, reducing its tendency to overfit to specific patterns in the original dataset. This improves generalization to unseen production data.
More data allows the model to learn robust features that better represent the underlying distribution of real-world item prices.
This aligns with AWS AI/ML best practices for improving model generalization and reducing overfitting.
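A rough learning-curve sketch shows the effect of data volume. It assumes synthetic price data and a memorizing 1-nearest-neighbor predictor, both hypothetical stand-ins for the company's real model, and measures held-out error as the training set grows.

```python
import random

def sample_prices(n, rng):
    """Hypothetical data: price = 3 * size + Gaussian noise."""
    samples = []
    for _ in range(n):
        size = rng.uniform(0.0, 10.0)
        samples.append((size, 3.0 * size + rng.gauss(0.0, 0.5)))
    return samples

def holdout_error(train, holdout):
    """Held-out MAE of a 1-nearest-neighbor predictor that memorizes `train`."""
    total = 0.0
    for size, price in holdout:
        _, nearest_price = min(train, key=lambda pair: abs(pair[0] - size))
        total += abs(nearest_price - price)
    return total / len(holdout)

rng = random.Random(42)
holdout = sample_prices(500, rng)  # stand-in for production data

learning_curve = {}
for n_train in (5, 50, 500):
    train = sample_prices(n_train, rng)
    learning_curve[n_train] = holdout_error(train, holdout)
    print(f"{n_train:>4} training examples -> held-out MAE {learning_curve[n_train]:.2f}")
```

With only a handful of examples the predictor has large gaps to fill between memorized points; with more examples those gaps shrink and the held-out error approaches the noise level of the data.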

D: Increase the model training time.

Not optimal: Longer training time without other adjustments could actually exacerbate overfitting, as the model might continue to memorize training data rather than generalize.
Training time alone doesn't address the fundamental issue of data insufficiency or lack of diversity.
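One common guard against training for too long is early stopping, a technique not named in the options above. The sketch below assumes per-epoch validation losses are available; the loss values are invented for illustration and the function name is hypothetical.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the best epoch seen before validation loss
    fails to improve for `patience` consecutive epochs."""
    best_epoch = 0
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # further training is only hurting validation loss
    return best_epoch

# Invented validation losses: they fall, then rise as the model starts to overfit.
illustrative_val_losses = [0.90, 0.62, 0.48, 0.45, 0.47, 0.52, 0.60]
best = early_stopping_epoch(illustrative_val_losses, patience=2)
print(f"keep the model from epoch {best}")
```

Training loss would typically keep falling past that epoch, which is why training time alone is a poor lever here.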

Recommended Approach

The most effective immediate action is C: Increase the volume of training data. This should be complemented by:

Data augmentation techniques if additional raw data isn't available
Hyperparameter tuning (not "adding" hyperparameters but optimizing the existing ones)
Regularization methods like dropout or L1/L2 regularization
Cross-validation to ensure the model generalizes well
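As a sketch of the cross-validation item in the list above, the following k-fold helper uses only the standard library. The function names, the toy data, and the mean-price baseline model are all assumptions made for illustration.

```python
import random

def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx

def cross_validated_mae(xs, ys, fit, k=5):
    """Average validation MAE over k folds; `fit` returns a predict function."""
    fold_scores = []
    for train_idx, val_idx in k_fold_splits(len(xs), k):
        predict = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        errors = [abs(predict(xs[i]) - ys[i]) for i in val_idx]
        fold_scores.append(sum(errors) / len(errors))
    return sum(fold_scores) / len(fold_scores), fold_scores

def fit_mean(train_x, train_y):
    """Hypothetical baseline model: always predict the mean training price."""
    mean_price = sum(train_y) / len(train_y)
    return lambda x: mean_price

# Toy data for illustration only.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
mean_mae, fold_maes = cross_validated_mae(xs, ys, fit_mean, k=5)
print(f"mean validation MAE over {len(fold_maes)} folds: {mean_mae:.2f}")
```

Because every example is used for validation exactly once, the averaged score is a steadier estimate of performance on unseen data than a single train/test split.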

However, among the given options, increasing training data volume provides the most direct and reliable path to improving generalization from training to production environments.

A company has a model that accurately predicts item prices on training data, but its performance drops substantially after deployment to production. What steps should the company take to address this issue?

Exam-Like

Last updated: February 8, 2026 at 20:17

A: Reduce the volume of data that is used in training. (10.0%)

B: Add hyperparameters to the model. (30.0%)

C: Increase the volume of data that is used in training. (50.0%)