AWS Certified Cloud Practitioner

Developers want to reduce repetition in a model's output, especially repeated phrases or words. Which inference setting helps prevent this?


RRitesh

Last updated: December 3, 2025 at 18:27

A. Increase temperature

B. Enable repetition penalty (e.g., 1.2)

C. Reduce top-p

D. Reduce top-k

Explanation:

Repetition penalty is a specific inference setting designed to reduce repetitive outputs from language models. Here's why option B is correct:

Repetition penalty is a parameter that penalizes tokens that have already appeared in the generated text
A value greater than 1.0 (like 1.2) discourages repetition
A value less than 1.0 encourages repetition
In common implementations, the penalty is applied to the scores (logits) of tokens that have already been generated, before those scores are converted to probabilities
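
The points above can be sketched in a few lines of Python. This is a minimal illustration, not any particular service's implementation: it assumes the convention used by some open-source libraries, where a positive score is divided by the penalty and a negative score is multiplied by it. The function names, the three-token vocabulary, and the scores are all hypothetical.

```python
import math

def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    """Return a copy of logits with already-generated tokens penalized.

    Assumed convention: positive scores are divided by the penalty and
    negative scores are multiplied by it, so penalty > 1.0 always lowers
    the score of a token that has already appeared.
    """
    penalized = list(logits)
    for token_id in set(generated_ids):
        score = penalized[token_id]
        penalized[token_id] = score / penalty if score > 0 else score * penalty
    return penalized

def softmax(logits):
    """Convert scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, -0.5]   # hypothetical scores for a 3-token vocabulary
generated_ids = [0, 0, 2]   # tokens 0 and 2 have already been generated

before = softmax(logits)
after = softmax(apply_repetition_penalty(logits, generated_ids, penalty=1.2))

print("before:", [round(p, 3) for p in before])
print("after: ", [round(p, 3) for p in after])
```

With a penalty of 1.2, the most likely token (already generated) loses probability mass and the unseen token gains it. Passing a penalty below 1.0 to the same function raises the score of seen tokens instead, which matches the "encourages repetition" case.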

A. Increase temperature - Temperature controls randomness/creativity:

Higher temperature (e.g., 0.8-1.0) increases randomness and diversity
Lower temperature (e.g., 0.1-0.3) makes outputs more deterministic and focused
While higher temperature can sometimes reduce repetition, it's not specifically designed for this purpose and can make outputs less coherent
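
To see why temperature is a blunt tool for this problem, here is a small sketch of temperature scaling. It assumes the usual formulation in which scores are divided by the temperature before the softmax; the function name and the example scores are hypothetical.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Divide scores by the temperature, then convert to probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for a 3-token vocabulary

focused = softmax_with_temperature(logits, temperature=0.2)  # sharper distribution
diverse = softmax_with_temperature(logits, temperature=1.0)  # flatter distribution

print("T=0.2:", [round(p, 3) for p in focused])
print("T=1.0:", [round(p, 3) for p in diverse])
```

The function never looks at which tokens were generated earlier. It reshapes the whole distribution at once, so any reduction in repetition is a side effect of added randomness rather than a targeted change.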

C. Reduce top-p - Top-p (nucleus sampling):

Restricts sampling to the smallest set of most probable tokens whose cumulative probability reaches the top-p threshold
Lower top-p values make outputs more focused but don't specifically target repetition
This affects which tokens are considered, not whether they've been used before
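
A minimal sketch of nucleus filtering makes the last point concrete. It assumes the standard description of top-p (keep the most probable tokens until their cumulative probability reaches the threshold, then renormalize); the function name and the example probabilities are hypothetical.

```python
def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of most probable tokens whose cumulative
    probability reaches top_p, then renormalize over that set."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for token_id in ranked:
        kept.append(token_id)
        cumulative += probs[token_id]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

probs = [0.5, 0.3, 0.15, 0.05]  # hypothetical next-token probabilities

wide = top_p_filter(probs, top_p=0.9)    # keeps more candidate tokens
narrow = top_p_filter(probs, top_p=0.5)  # keeps only the top token here

print("top_p=0.9 keeps tokens:", sorted(wide))
print("top_p=0.5 keeps tokens:", sorted(narrow))
```

The filter takes only the current probabilities as input. Nothing in it depends on the generation history, so a token that was just emitted is kept or dropped purely on its current probability.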

D. Reduce top-k - Top-k sampling:

Limits sampling to the k most probable tokens at each step
Lower top-k values narrow the candidate set but don't specifically target repetition

When to use repetition penalty:

When you notice a model repeating phrases like "the the the" or "I think I think"
Set repetition penalty to values like 1.1-1.3 to moderately discourage repetition
Values like 1.5+ can strongly discourage repetition but may affect coherence
This is particularly useful for long-form generation, creative writing, or dialogue systems where natural flow is important

For reducing repetition while maintaining coherence, use a combination of:
