
**Answer-first summary for fast verification**
Answer: B. The model truncates or ignores excess tokens.
## Explanation

When a model has a fixed context window of 8,000 tokens and receives 10,000 tokens of input, it cannot process all of them at once: the context window is an architectural constraint. Here's what happens:

1. **Context Window Limitation**: The context window is the maximum number of tokens the model can process at once. This is a fixed architectural constraint determined during model training.
2. **Truncation Process**: The input is cut down to the 8,000-token limit. Depending on the implementation, this typically means:
   - keeping the most recent 8,000 tokens (a sliding-window approach),
   - keeping the first 8,000 tokens and discarding the rest, or
   - applying some other truncation strategy defined by the model implementation.
3. **Why Not the Other Options**:
   - **A (Automatic expansion)**: A model cannot expand its own context window; changing it is a fundamental architectural change that requires retraining or specialized techniques.
   - **C (Compression into embeddings)**: Embeddings represent tokens, but they do not reduce the token count; each token still occupies one position in the window.
   - **D (Permanent failure)**: The model does not fail permanently; it simply processes what fits within its constraints.
4. **Real-world Implications**: Truncation can silently drop important context, especially in long documents or conversations. Techniques such as chunking or hierarchical processing can help mitigate this limitation.

**Correct Answer: B** - The model truncates or ignores excess tokens.
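The two truncation strategies described above ("keep the most recent tokens" vs. "keep the first tokens") can be sketched in a few lines of Python. Note that `truncate_to_context` is a hypothetical helper for illustration, not a real library API; real tokenizers and serving stacks expose their own truncation options.

```python
def truncate_to_context(tokens, max_tokens=8000, keep="last"):
    """Truncate a list of token IDs to fit a fixed context window.

    keep="last"  keeps the most recent max_tokens (sliding-window style).
    keep="first" keeps the beginning of the input and drops the rest.
    """
    if len(tokens) <= max_tokens:
        return tokens  # already fits; nothing is dropped
    if keep == "last":
        return tokens[-max_tokens:]
    return tokens[:max_tokens]


# Stand-in for a 10,000-token input (token IDs 0..9999).
tokens = list(range(10_000))

kept = truncate_to_context(tokens)
print(len(kept))   # 8000 tokens survive; the first 2,000 are dropped
print(kept[0])     # 2000 -> the earliest tokens were discarded
```

In practice, which strategy is appropriate depends on the task: chat applications usually keep the most recent tokens so the latest turns survive, while document-classification pipelines often keep the beginning of the text.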
Author: Jin H
**Question:** If a model has a context window of 8,000 tokens, what happens when the user inputs 10,000 tokens?

- **A.** The model automatically expands its context window
- **B.** The model truncates or ignores excess tokens
- **C.** The model compresses input into embeddings to fit
- **D.** The model fails permanently