
A summarization LLM produces outputs that are too short and miss important details. What's the best adjustment?
A. Increase top-p from 0.7 → 0.9
B. Reduce temperature to 0.2
C. Set max_tokens higher
D. Increase repetition penalty
Explanation:
When a summarization LLM produces outputs that are too short and miss important details, the most direct solution is to increase the max_tokens parameter.
max_tokens controls output length: This parameter sets the maximum number of tokens the model can generate in its response. If outputs are too short, increasing this value allows the model to generate more detailed summaries.
Directly addresses the problem: The issue described is specifically that outputs are too short and miss important details; raising the token limit removes the length ceiling that causes this.
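For illustration, here is a minimal sketch of raising the output budget, assuming an OpenAI-compatible Python SDK; the model name, token values, and helper function are placeholders for illustration, not part of the original question.

    # Minimal sketch: raise max_tokens so the summary is not cut short.
    # Assumes an OpenAI-compatible SDK; model name and limits are placeholders.
    from openai import OpenAI

    client = OpenAI()  # reads the API key from the environment

    def summarize(document: str, max_tokens: int = 1024) -> str:
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model
            messages=[
                {"role": "system", "content": "Summarize the user's text in detail."},
                {"role": "user", "content": document},
            ],
            max_tokens=max_tokens,  # ceiling on output length
            temperature=0.3,        # keep summaries focused
        )
        return response.choices[0].message.content

Raising max_tokens (for example from 256 to 1024) simply gives the model room to produce a fuller summary; the other parameters in the call are unchanged.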
Option A (Increase top-p): Top-p (nucleus sampling) controls diversity by limiting token selection to the smallest set whose cumulative probability exceeds p. Increasing it from 0.7 to 0.9 makes the model consider more diverse tokens but doesn't directly address output length.
Option B (Reduce temperature): Temperature controls randomness - lower values (like 0.2) make outputs more deterministic and focused, but this might actually make summaries even more concise rather than more detailed.
Option D (Increase repetition penalty): This parameter discourages repetition of words/phrases. While it might improve quality by reducing redundancy, it doesn't directly address the issue of outputs being too short.
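As a rough sketch of how these four knobs differ in practice, the snippet below uses the Hugging Face transformers library (a common open-source stack, not something specified by the question); only max_new_tokens raises the length ceiling, while top_p, temperature, and repetition_penalty shape how tokens are chosen within it. The model name is a placeholder.

    # Sketch: the four knobs from the answer options, as exposed by transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # placeholder small model
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    inputs = tokenizer("Summarize: ...", return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,      # option C: allows longer, more detailed output
        top_p=0.9,               # option A: wider nucleus, more varied wording only
        temperature=0.2,         # option B: more deterministic, not longer
        repetition_penalty=1.2,  # option D: less repetition, not more length
        do_sample=True,          # sampling must be on for top_p/temperature to apply
    )
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))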
For summarization tasks, you might also adjust the prompt to explicitly request a more detailed summary (see the sketch below).
A min_tokens parameter (where available) can also be set to enforce a minimum output length.
Different models have different practical token limits, so the right ceiling varies by model and task.
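As a hedged sketch of the first two adjustments, the snippet below combines an explicit request for detail in the prompt with a minimum-length floor. It assumes the Hugging Face transformers API, where the floor is called min_new_tokens; equivalents such as min_tokens exist in some serving stacks but are not universal. The model, prompt wording, and limits are illustrative.

    # Sketch: ask for detail explicitly and enforce a minimum output length.
    # Assumes Hugging Face transformers; model, limits, and prompt are illustrative.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # placeholder model
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    report_text = "..."  # the document to summarize
    prompt = (
        "Summarize the following report in 4-6 paragraphs, covering all key "
        "findings, figures, and caveats:\n\n" + report_text
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_new_tokens=800,   # generous ceiling for a detailed summary
        min_new_tokens=200,   # floor so generation cannot stop after a one-liner
    )
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))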