
Answer-first summary for fast verification
Answer: Number of customer inquiries processed per unit of time
The question asks which metric a Generative AI Engineer should monitor for a customer service LLM application after it has been deployed in production. Option A (number of customer inquiries processed per unit of time) directly measures the application's throughput, the key operational signal for a deployed service: it shows whether the system is keeping up with customer demand and performing reliably. The community discussion supports A with 100% consensus. The other options concern earlier stages of the model lifecycle: B (energy usage per query) is an optimization concern, C (final perplexity scores) is a training-time evaluation metric, and D (HuggingFace Leaderboard values) reflects base-model selection, not production performance.
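To make the throughput metric concrete, here is a minimal sketch of how inquiries-per-minute could be computed from per-request timestamps. The function name and instrumentation are hypothetical, not from any specific monitoring library.

```python
from datetime import datetime, timedelta

def throughput_per_minute(timestamps, min_window_minutes=1.0):
    """Return inquiries processed per minute over the observed span.

    `timestamps` is a list of datetime objects, one per handled
    inquiry (illustrative instrumentation; names are assumptions).
    The span is floored at `min_window_minutes` to avoid dividing
    by a near-zero interval when all requests arrive in a burst.
    """
    if not timestamps:
        return 0.0
    span_minutes = (max(timestamps) - min(timestamps)) / timedelta(minutes=1)
    return len(timestamps) / max(span_minutes, min_window_minutes)

# Example: 7 inquiries spread over 2 minutes -> 3.5 per minute
base = datetime(2024, 1, 1, 12, 0, 0)
stamps = [base + timedelta(seconds=20 * i) for i in range(7)]
```

In a real deployment this counter would typically be emitted by the serving layer (e.g. as a rate metric in a monitoring system) rather than computed from raw timestamp lists, but the quantity being tracked is the same.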
Author: LeetQuiz Editorial Team
Which metric should a Generative AI Engineer monitor for a customer service LLM application that answers customer inquiries after it has been deployed in production?
A. Number of customer inquiries processed per unit of time
B. Energy usage per query
C. Final perplexity scores for the training of the model
D. HuggingFace Leaderboard values for the base LLM