A company is launching a mobile app for foreign language learning that uses a large language model (LLM) to improve text coherence. They have compiled a diverse text dataset and augmented it with more readable versions of its examples. They want the LLM's output to closely match the style and quality of these enhanced examples. Which metric should the company use to evaluate whether the LLM's outputs align with these provided examples? | AWS Certified AI Practitioner Quiz - LeetQuiz
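
The scenario describes comparing generated text against reference examples, which is what reference-based overlap metrics such as ROUGE are designed for. The following is a minimal sketch of a ROUGE-1 style unigram F1 score, assuming simple whitespace tokenization and lowercasing. The function name `rouge1_f1` and the sample sentences are hypothetical and only for illustration; a production evaluation would more likely rely on an established ROUGE implementation with proper tokenization and stemming.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a model output and a reference text.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each word counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical example: an enhanced reference sentence and an LLM output.
    reference = "the cat sat on the mat"
    llm_output = "the cat is on the mat"
    print(f"ROUGE-1 F1: {rouge1_f1(llm_output, reference):.3f}")
```

A score near 1 means the output shares most of its words with the enhanced example, while a score near 0 means little overlap. Averaging the score over the whole dataset gives a rough signal of how closely the LLM's outputs track the provided examples.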