
Answer-first summary for fast verification
Answer: They allow parallel computation and handle long-term dependencies better
Transformers' key advantage over RNN-based models is that they allow **parallel computation** and handle **long-term dependencies** better. Unlike RNNs, which process a sequence one token at a time, Transformers use self-attention to process all tokens simultaneously, making training far more efficient on parallel hardware. In addition, because attention gives every pair of tokens a direct connection regardless of their distance, Transformers mitigate the vanishing-gradient problem RNNs face with long sequences, enabling them to capture dependencies between distant tokens.
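The contrast can be seen directly in code. Below is a minimal NumPy sketch (toy dimensions, random weights, and all function names are illustrative assumptions, not any particular library's API): self-attention produces every output position with a handful of matrix products, while the RNN must loop over time steps because each hidden state depends on the previous one.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # All tokens attend to all others in one matrix product:
    # no sequential dependency, so the work parallelizes.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores, axis=-1) @ V

def rnn_forward(X, Wx, Wh):
    # Each hidden state depends on the previous one, so the
    # loop cannot be parallelized across time steps.
    h = np.zeros(Wh.shape[0])
    hs = []
    for x_t in X:                      # inherently sequential
        h = np.tanh(x_t @ Wx + h @ Wh)
        hs.append(h)
    return np.stack(hs)

rng = np.random.default_rng(0)
T, d = 6, 4                            # toy sequence length and width
X = rng.standard_normal((T, d))
out_attn = self_attention(X, *(rng.standard_normal((d, d)) for _ in range(3)))
out_rnn = rnn_forward(X, rng.standard_normal((d, d)), rng.standard_normal((d, d)))
print(out_attn.shape, out_rnn.shape)   # both produce one vector per token
```

Note also the path length between distant tokens: in the attention block it is a single weighted sum, whereas in the RNN a signal from the first token must survive T repeated `tanh` transformations to reach the last one, which is exactly where gradients vanish.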
Author: Ritesh Yadav
Q4 – What is a key advantage of Transformers over RNN-based models?
A. They process input sequentially, maintaining word order
B. They rely on convolution filters for speed
C. They allow parallel computation and handle long-term dependencies better ✓ (correct)
D. They require fewer parameters