
You are working on a text summarization project and have tested several models. Below are the ROUGE-1, ROUGE-2, and ROUGE-L scores for different models:

| Model | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| Model A | 0.55 | 0.43 | 0.48 |
| Model B | 0.60 | 0.45 | 0.52 |
| Model C | 0.62 | 0.46 | 0.55 |
| Model D | 0.58 | 0.44 | 0.50 |

Given that ROUGE-1 measures unigram overlap, ROUGE-2 measures bigram overlap, and ROUGE-L focuses on the longest common subsequence (LCS), which model should you select for this summarization task if your goal is to prioritize overall summary quality and coherence?
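To make the three metric definitions concrete, here is a minimal Python sketch of how ROUGE-1, ROUGE-2, and ROUGE-L could be computed for a single candidate/reference pair. It is a simplification, not the official ROUGE toolkit: it assumes lowercased whitespace tokenization, no stemming, a single reference, and reports the balanced F1 variant. The function names (`rouge_n`, `rouge_l`, `lcs_length`) and the example sentences are hypothetical and chosen only for illustration; they are unrelated to the models in the table.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, candidate_total, reference_total):
    """Balanced F1 from an overlap count and the two totals."""
    if overlap == 0 or candidate_total == 0 or reference_total == 0:
        return 0.0
    precision = overlap / candidate_total
    recall = overlap / reference_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1: clipped n-gram overlap (n=1 unigrams, n=2 bigrams)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1: based on the LCS, so word order matters but gaps are allowed."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


if __name__ == "__main__":
    reference = "the cat sat on the mat"   # illustrative reference summary
    candidate = "the cat lay on the mat"   # illustrative model output
    print("ROUGE-1:", round(rouge_n(candidate, reference, 1), 3))
    print("ROUGE-2:", round(rouge_n(candidate, reference, 2), 3))
    print("ROUGE-L:", round(rouge_l(candidate, reference), 3))
```

For this toy pair, one substituted word leaves five of six unigrams matching but breaks two of the five bigrams, which is why ROUGE-2 is usually the lowest of the three scores, as it is for every model in the table.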