
What is a key advantage of Transformers over RNN-based models?
A. They process input sequentially, maintaining word order
B. They rely on convolution filters for speed
C. They allow parallel computation and handle long-term dependencies better
D. They require fewer parameters

Correct answer: C
Explanation:
Transformers have several key advantages over RNN-based models:
RNNs process sequences sequentially (one token at a time), which makes them inherently slow for training.
Transformers process all tokens in a sequence simultaneously through self-attention mechanisms, enabling parallel computation and significantly faster training times.
RNNs suffer from vanishing/exploding gradient problems when dealing with long sequences, making it difficult to capture long-range dependencies.
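As a back-of-the-envelope illustration of why this happens, consider a scalar recurrence as a toy stand-in for an RNN (the weight value here is an assumption chosen for illustration): the gradient flowing through T steps scales like the recurrent weight raised to the T-th power, so it decays or explodes geometrically with sequence length.

```python
# Toy illustration of vanishing gradients in a scalar recurrence
# h_t = w * h_{t-1}: the gradient of h_T with respect to h_0 is w**T,
# which shrinks geometrically when |w| < 1 (and explodes when |w| > 1).
w = 0.9  # assumed recurrent weight, |w| < 1 -> vanishing
for T in (10, 50, 100):
    print(T, w ** T)  # ~0.35, ~0.0052, ~0.000027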
Transformers use self-attention mechanisms that can directly connect any two positions in the sequence, regardless of distance, allowing them to capture long-term dependencies more effectively.
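To make the parallelism and the direct position-to-position connections concrete, here is a minimal NumPy sketch of scaled dot-product self-attention; the weight names (Wq, Wk, Wv) and the toy dimensions are illustrative assumptions, not something specified by the quiz.

```python
# A minimal sketch of scaled dot-product self-attention in NumPy.
# Wq, Wk, Wv stand in for learned projection matrices; everything here
# is illustrative, not a production implementation.
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    # X: (seq_len, d_model) -- all tokens enter at once, not one at a time.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    # scores[i, j] directly relates position i to position j,
    # regardless of how far apart they are in the sequence.
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ V  # one matrix product covers the whole sequence

# Toy usage: 6 tokens, model dimension 8.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (6, 8): every position computed in parallel
```

Because the score matrix covers every pair of positions at once, both the parallel-training advantage and the direct long-range connections fall out of the same matrix product.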
Option A is incorrect: Transformers do NOT process input sequentially; sequential processing is a characteristic of RNNs.
Option B is incorrect: Transformers do not rely on convolution filters; they use attention mechanisms.
Option D is incorrect: Transformers typically have MORE parameters than RNNs due to their attention mechanisms and feed-forward networks.
These advantages come from three core components of the Transformer architecture (sketched in code after this list):
Self-Attention: allows the model to weigh the importance of different words in a sequence relative to each other
Positional Encoding: injects information about word order, since Transformers don't process tokens sequentially
Multi-Head Attention: enables the model to focus on different parts of the sequence simultaneously
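The sketch below illustrates the latter two components, reusing the self_attention function from the earlier sketch; the sinusoidal formula follows "Attention Is All You Need", while the head-splitting scheme and toy dimensions are illustrative assumptions.

```python
# Sinusoidal positional encoding and a simplified multi-head attention,
# reusing the self_attention function defined in the sketch above.
import numpy as np

def positional_encoding(seq_len, d_model):
    # pe[pos, 2i]   = sin(pos / 10000^(2i/d_model))
    # pe[pos, 2i+1] = cos(pos / 10000^(2i/d_model))
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model // 2)[None, :]
    angles = pos / np.power(10000.0, 2 * i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)  # even dimensions
    pe[:, 1::2] = np.cos(angles)  # odd dimensions
    return pe

def multi_head_attention(X, heads, Wo):
    # heads: one (Wq, Wk, Wv) triple per head, each projecting d_model
    # down to d_model // n_heads; each head can attend to a different
    # aspect of the sequence. Wo mixes the concatenated head outputs.
    outputs = [self_attention(X, Wq, Wk, Wv) for Wq, Wk, Wv in heads]
    return np.concatenate(outputs, axis=-1) @ Wo

# Toy usage: add position information to the embeddings, then attend.
seq_len, d_model, n_heads = 6, 8, 2
rng = np.random.default_rng(1)
X = rng.normal(size=(seq_len, d_model)) + positional_encoding(seq_len, d_model)
heads = [tuple(rng.normal(size=(d_model, d_model // n_heads)) for _ in range(3))
         for _ in range(n_heads)]
Wo = rng.normal(size=(d_model, d_model))
print(multi_head_attention(X, heads, Wo).shape)  # (6, 8)
```

Splitting d_model across heads keeps the total cost comparable to single-head attention while letting each head specialize in a different relationship between positions.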
This parallel processing capability and superior handling of long-range dependencies make Transformers particularly well-suited for large-scale language modeling tasks.