
**Answer: D.** The input tokens exceed the model's context size.
## Detailed Explanation

When building an AI application for text summarization using large language models (LLMs), one of the most critical technical constraints is the **context window size** of the model. This is the maximum number of tokens (words, subwords, or punctuation marks) that the model can process in a single request.

### Why Option D Is Correct

**Input tokens exceeding the model's context size** is the primary reason for failure when summarizing long books:

1. **Technical limitation**: Every LLM (whether GPT, Claude, or another model) has a fixed maximum context window (e.g., 4K, 8K, 16K, 32K, or 128K tokens). When a book's tokenized content exceeds this limit, the model cannot process the entire text in one inference call.
2. **Failure mechanism**: The application likely feeds the entire book as input to the model. If the token count surpasses the model's context window, the API call fails with an error (such as "context length exceeded"), and no summary is generated.
3. **Real-world scenario**: Books vary significantly in length, from short novellas to lengthy technical manuals. Longer books contain more tokens and are therefore more likely to hit this limit.

### Why the Other Options Are Incorrect

- **A. The temperature is set too high**: Temperature controls the randomness of the model's output. An excessively high temperature might produce a less coherent summary, but it would not cause a complete failure to generate one.
- **B. The selected model does not support fine-tuning**: Fine-tuning is a training process that adapts a model to a specific task. The question describes an inference scenario (summarizing during testing), not a training scenario. Models that do not support fine-tuning can still perform summarization.
- **C. The Top P value is too high**: Top P (nucleus sampling) affects output diversity by restricting sampling to the most probable next tokens. Like temperature, it influences output quality but does not prevent the model from processing inputs or generating summaries.

### Best Practices

For production applications that must handle variable-length documents:

1. **Implement chunking strategies**: Split long texts into segments that fit within the model's context window, then combine or summarize the chunk summaries recursively.
2. **Select an appropriate model**: Choose a model with a large context window (128K+ tokens) for book-length content.
3. **Implement error handling**: Detect context-length errors and fall back to a chunking strategy.

The described failure pattern (working for some books but not others) strongly indicates a length-dependent limitation rather than a configuration or model-capability issue, making D the correct answer.
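The chunking strategy above can be sketched in a few lines of Python. This is a minimal map-reduce sketch, not a production implementation: `call_model` is a hypothetical stand-in for any LLM API call, `MAX_CONTEXT_TOKENS` is an assumed limit, and the 4-characters-per-token heuristic is a rough estimate (a real tokenizer such as tiktoken would be more accurate).

```python
# Map-reduce chunking sketch for summarizing texts that exceed the
# model's context window. `call_model` is a hypothetical LLM API call.

MAX_CONTEXT_TOKENS = 8_000   # assumed model limit for this sketch
CHARS_PER_TOKEN = 4          # rough heuristic for English text

def estimate_tokens(text: str) -> int:
    """Crude token estimate; use a real tokenizer in production."""
    return len(text) // CHARS_PER_TOKEN

def split_into_chunks(text: str, max_tokens: int) -> list[str]:
    """Split on paragraph boundaries so each chunk fits the budget."""
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        candidate = (current + "\n\n" + paragraph).strip()
        if estimate_tokens(candidate) > max_tokens and current:
            chunks.append(current)
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

def summarize_long_text(text: str, call_model) -> str:
    """Summarize each chunk, then recursively summarize the summaries."""
    budget = MAX_CONTEXT_TOKENS // 2  # leave room for prompt and output
    if estimate_tokens(text) <= budget:
        return call_model(f"Summarize:\n{text}")
    partial = [call_model(f"Summarize:\n{c}")
               for c in split_into_chunks(text, budget)]
    return summarize_long_text("\n\n".join(partial), call_model)
```

Note that a single paragraph larger than the budget would still produce an oversized chunk; a fuller implementation would split such paragraphs further (by sentence or by fixed character count).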
Author: LeetQuiz Editorial Team
A company is developing an AI application to summarize books of different lengths. During testing, the application is unable to generate summaries for certain books.
What is the reason the application fails to summarize some books?
A. The temperature is set too high.
B. The selected model does not support fine-tuning.
C. The Top P value is too high.
D. The input tokens exceed the model's context size.