
**Answer:** Tokens
## Explanation

In Large Language Models (LLMs), text is processed by breaking it down into smaller units called **tokens**. Tokens can be whole words, subwords, or individual characters, depending on the tokenization strategy the model uses.

**Key points:**

1. **Tokenization** is the process of splitting text into tokens.
2. The word "unbelievable" might be split into tokens such as "un", "believ", and "able", depending on the tokenizer.
3. These tokens are then converted into numerical representations (embeddings) for processing.
4. Characters would be individual letters, which is too granular for most LLMs.
5. Embeddings are the numerical vector representations of tokens, not the text sub-units themselves.
6. Layers refer to the neural network architecture, not the text sub-units.

The correct answer is **B. Tokens** because tokens are the fundamental sub-units that LLMs use to process and represent text.
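The "unbelievable" split in point 2 can be illustrated with a minimal sketch of greedy longest-match subword tokenization. The vocabulary here is a hypothetical toy example chosen to reproduce that split; real tokenizers (e.g. BPE or WordPiece) learn their vocabularies from large corpora:

```python
# Hypothetical toy vocabulary; real tokenizers learn theirs from data.
VOCAB = {"un", "believ", "able", "a", "b", "e", "i", "l", "n", "u", "v"}

def tokenize(word: str) -> list[str]:
    """Greedily match the longest vocabulary entry at each position."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest possible substring starting at i first.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in VOCAB:
                tokens.append(piece)
                i = j
                break
        else:
            # Fall back to a single character for unknown symbols.
            tokens.append(word[i])
            i += 1
    return tokens

print(tokenize("unbelievable"))  # → ['un', 'believ', 'able']
```

Production tokenizers use learned merge rules rather than a hand-written vocabulary, but the output shape is the same: a list of subword strings.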
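Point 3 (tokens becoming embeddings) can likewise be sketched: each token is mapped to an integer ID, and that ID indexes a row of an embedding table. The vocabulary, IDs, and embedding dimension below are illustrative assumptions, not values from any real model:

```python
import numpy as np

# Hypothetical token-to-ID mapping and a tiny embedding dimension.
vocab = {"un": 0, "believ": 1, "able": 2}
d_model = 4

# Embedding table: one learned vector per vocabulary entry.
# In a real model these weights are trained; here they are random.
rng = np.random.default_rng(0)
embedding_table = rng.standard_normal((len(vocab), d_model))

# Tokens -> IDs -> vectors: a simple row lookup into the table.
ids = [vocab[t] for t in ["un", "believ", "able"]]
vectors = embedding_table[ids]

print(vectors.shape)  # (3, 4): one d_model-sized vector per token
```

This is why embeddings are the wrong answer to the question: they are the vectors the tokens are converted *into*, not the text sub-units themselves.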
Author: Jin H