
Explanation:
The correct answer is C: Average response time.
Why C is correct: Average response time directly measures how quickly an AI model processes input and generates output during inference or operational runtime. This metric is crucial for assessing runtime efficiency because it quantifies the latency between receiving a request and delivering a response. In production environments, lower average response times indicate higher efficiency, which is essential for real-time applications like chatbots, recommendation systems, fraud detection, and interactive services where user experience depends on minimal delay.
Why other options are incorrect:
In summary, average response time is the most appropriate metric for evaluating the runtime efficiency of operating AI models, as it focuses on performance during inference, aligning with best practices for monitoring deployed AI systems in AWS and other cloud environments.
Ultimate access to all questions.
No comments yet.