An LLM is generating overly long and repetitive answers. The engineering team wants to force the model to sample from only the most likely next tokens. Which setting is most suitable? | AWS Certified Cloud Practitioner Quiz - LeetQuiz
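Limiting generation to only the most likely next tokens is what a top-k sampling setting does (a `top_k` parameter in many inference APIs; nucleus/top-p is the probability-mass variant). As a hedged illustration, not tied to any specific framework, here is a minimal sketch of top-k sampling over a toy logit vector:

```python
import math
import random

def top_k_sample(logits, k, rng=None):
    """Sample a token index from only the k highest-logit candidates."""
    rng = rng or random.Random()
    # Keep the indices of the k largest logits, discard everything else.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the surviving logits only (renormalizes the distribution).
    m = max(logits[i] for i in top)
    exps = [math.exp(logits[i] - m) for i in top]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one index according to the renormalized probabilities.
    r = rng.random()
    acc = 0.0
    for idx, p in zip(top, probs):
        acc += p
        if r < acc:
            return idx
    return top[-1]

# Toy vocabulary of 5 tokens; with k=2 only the two most likely
# tokens (indices 0 and 1) can ever be chosen.
logits = [2.0, 1.0, 0.1, -1.0, -3.0]
print(top_k_sample(logits, k=2))
```

A small k cuts off the long tail of unlikely tokens, which tends to reduce rambling, repetitive continuations; lowering temperature is the related knob that sharpens the distribution without truncating it.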