
**Answer:** Tokens provide language-independent representation
## Explanation

AI models count tokens instead of characters because:

- **Tokens provide language-independent representation**: Tokens represent meaningful units of text (words, subwords, or characters) that work consistently across different languages, unlike character counting, which can vary significantly between languages.
- **Efficient processing**: Tokens allow models to process text more efficiently by breaking it down into meaningful units rather than individual characters.
- **Better semantic understanding**: Tokenization helps models capture the semantic meaning of text better than character-by-character processing.
- **Consistent input size**: Token counting provides a more consistent measure of text length across different writing systems and languages.

**Why the other options are incorrect:**

- **B**: Tokens are not always shorter than characters; in fact, a single token typically represents multiple characters.
- **C**: Characters can indeed be encoded numerically (using ASCII, Unicode, etc.).
- **D**: Tokens do not reduce context-window cost to zero; they help manage computational cost more efficiently, but they do not eliminate it.
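The contrast between character counting and token counting can be sketched with a toy greedy tokenizer. The tiny vocabulary below is hypothetical and purely for illustration; real models learn subword vocabularies (e.g. via byte-pair encoding) from data rather than using a hand-written set:

```python
def toy_tokenize(text, vocab):
    """Greedy longest-match tokenization against a fixed vocabulary.

    Falls back to single characters when no vocabulary entry matches,
    mirroring how subword tokenizers never fail on unseen text.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest vocabulary entry that matches at position i.
        for length in range(min(len(text) - i, 8), 0, -1):
            piece = text[i:i + length]
            if piece in vocab or length == 1:
                tokens.append(piece)
                i += length
                break
    return tokens

# Hypothetical vocabulary for this example only.
vocab = {"token", "ization", "counts", " ", "sub", "word", "units"}
text = "tokenization counts subword units"

tokens = toy_tokenize(text, vocab)
print(len(text))    # 33 characters
print(len(tokens))  # 9 tokens -- each token spans one or more characters
print(tokens)
```

Note how the token count (9) is far smaller than the character count (33): a single token like `"ization"` stands in for seven characters, which is why token counts give a more consistent and efficient measure of input size than raw character counts.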
Author: Ritesh Yadav