
Answer-first summary for fast verification
Answer: 1. Maintain the same machine type on the endpoint. Configure the endpoint to enable autoscaling based on vCPU usage. 2. Set up a monitoring job and an alert for CPU usage. 3. If you receive an alert, investigate the cause.
Option C is the correct answer. It keeps the existing machine type and configures the endpoint to autoscale on vCPU usage, so the deployment adds replicas as holiday traffic grows and removes them when traffic falls, without manual intervention. The monitoring job and CPU usage alert let you detect and investigate problems early. Because you pay for extra replicas only while the load requires them, this option balances cost against scalability and reliability better than the other options, which rely on either manual intervention or unnecessary preemptive scaling.
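The scaling arithmetic behind this answer can be sketched in Python. This is a minimal illustration, not the endpoint's actual configuration: the 50% baseline CPU utilization, the 60% utilization target, and the `n1-standard-8` machine type are assumptions chosen for the example, and `required_replicas` and `deploy_with_autoscaling` are hypothetical helper names. Only the single 8-vCPU machine and the fourfold traffic increase come from the question. The deploy helper assumes the `google-cloud-aiplatform` SDK is installed and is not called here.

```python
import math

def required_replicas(current_replicas, current_cpu_utilization, target_cpu_utilization):
    """Estimate replicas needed to bring CPU utilization back to the target.

    Uses the common autoscaling rule:
    ceil(current_replicas * current_utilization / target_utilization).
    """
    if target_cpu_utilization <= 0:
        raise ValueError("target_cpu_utilization must be positive")
    ratio = current_cpu_utilization / target_cpu_utilization
    return max(1, math.ceil(current_replicas * ratio))

# From the question: one replica today, traffic expected to reach 4x.
TRAFFIC_MULTIPLIER = 4
# Assumptions for illustration only (not stated in the question).
BASELINE_CPU_UTILIZATION = 50   # percent, on the single existing replica
TARGET_CPU_UTILIZATION = 60     # percent, autoscaling target

peak_replicas = required_replicas(
    current_replicas=1,
    current_cpu_utilization=BASELINE_CPU_UTILIZATION * TRAFFIC_MULTIPLIER,
    target_cpu_utilization=TARGET_CPU_UTILIZATION,
)

# Same machine type as before; only the replica bounds and CPU target change.
DEPLOY_SETTINGS = {
    "machine_type": "n1-standard-8",  # assumed 8-vCPU machine type
    "min_replica_count": 1,
    "max_replica_count": peak_replicas,
    "autoscaling_target_cpu_utilization": TARGET_CPU_UTILIZATION,
}

def deploy_with_autoscaling(model_name, endpoint_name):
    """Hypothetical helper: deploy the model with the settings above.

    Requires the google-cloud-aiplatform package and valid resource names.
    """
    from google.cloud import aiplatform  # imported lazily; optional dependency

    model = aiplatform.Model(model_name)
    endpoint = aiplatform.Endpoint(endpoint_name)
    model.deploy(endpoint=endpoint, **DEPLOY_SETTINGS)

print(peak_replicas)  # 4 under the assumptions above
```

Under these assumptions the endpoint would need up to four replicas at peak and would fall back to one when traffic returns to normal, which is why autoscaling costs less than provisioning a larger machine for the whole season.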
Author: LeetQuiz Editorial Team
You are a machine learning engineer working for an online grocery store, which uses a custom ML model to recommend recipes to users when they arrive at the website. To optimize costs, you initially deployed this model on a Vertex AI endpoint using a single machine with 8 vCPUs and no accelerators, sized for the queries per second (QPS) the model can handle. With the holiday season approaching, you expect traffic to increase to four times the usual daily amount. You need to ensure that the model can scale efficiently to meet the increased demand without significantly increasing operational costs. What should you do?
A
B
C
D