
Answer-first summary for fast verification
Answer: Tokens provide language-independent representation
## Explanation

**Correct Answer: A - Tokens provide language-independent representation**

**Why tokens are used instead of characters:**

1. **Language Independence**: Tokens represent meaningful units of text rather than individual characters. For example, in English a token might be a word or subword, while in languages like Chinese tokens represent characters or character combinations that carry meaning.
2. **Efficiency**: Tokenization allows models to process text more efficiently by breaking it into meaningful units rather than individual characters, which reduces computational complexity.
3. **Semantic Representation**: Tokens capture semantic meaning better than individual characters. For instance, the word "unbelievable" as a single token carries more meaning than processing each character separately.
4. **Vocabulary Management**: Tokenization helps manage vocabulary size by using subword tokenization techniques (like Byte Pair Encoding) that can handle rare words and out-of-vocabulary terms.

**Why the other options are incorrect:**

- **B**: Tokens are NOT always shorter than characters - some tokens can be longer than individual characters, especially in languages where single characters represent entire words.
- **C**: Characters CAN be encoded numerically (using character encodings like UTF-8), so this statement is false.
- **D**: Tokens do NOT reduce context-window cost to zero - they still consume computational resources, though they are more efficient than character-level processing.

**Key Takeaway**: Tokenization is fundamental to how modern language models process text, allowing them to work efficiently across different languages while capturing meaningful semantic units.
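The subword idea in point 4 can be sketched with a toy Byte Pair Encoding loop. This is a minimal illustration, not a production tokenizer: it starts from individual characters and repeatedly merges the most frequent adjacent pair, which is how BPE turns character sequences into larger tokens. The function names here are invented for the example.

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Count adjacent token pairs and return the most common one."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0] if pairs else None

def bpe_train(text, num_merges):
    """Toy BPE: start from characters, greedily merge frequent pairs."""
    tokens = list(text)          # character-level start
    merges = []                  # learned merge rules, in order
    for _ in range(num_merges):
        pair = most_frequent_pair(tokens)
        if pair is None:
            break
        a, b = pair
        merges.append(pair)
        # Rewrite the token sequence, fusing every occurrence of (a, b).
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
                merged.append(a + b)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens, merges

tokens, merges = bpe_train("low lower lowest", 5)
```

After a few merges, frequent character runs like "low" collapse into single tokens, so the sequence is shorter than the raw character count while still reconstructing the original text exactly.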
Author: Ritesh Yadav
**Why do AI models count tokens instead of characters?**

- **A.** Tokens provide language-independent representation
- **B.** Tokens are always shorter than characters
- **C.** Characters cannot be encoded numerically
- **D.** Tokens reduce context-window cost to zero