
Answer-first summary for fast verification
Answer: D. Increase the number of maximum replicas to 6 nodes, each with 1 e2-standard-2 machine.
The question asks for the most cost-effective way to handle traffic that fluctuates and can spike to four times the endpoint's current capacity. Option D is the best choice: raising the maximum replica count to 6 lets Vertex AI autoscale horizontally, adding e2-standard-2 replicas during spikes and removing them when traffic drops, so you pay for extra nodes only while they are needed. Assuming throughput scales roughly linearly with replica count, a 4x spike needs about four replicas of the current machine, so a ceiling of six also leaves some headroom. Option A is unsuitable because scikit-learn models don't support TPU acceleration. Option B replaces the node with a single, much larger fixed machine, which is over-provisioned (and billed) during low traffic. Option C relies on manual intervention after an alert, which is reactive and slower than automated autoscaling. The community discussion favors D (75% consensus) because autoscaling handles the spikes cost-effectively.
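The sizing argument above can be checked with a little arithmetic. The sketch below is illustrative only: it assumes one e2-standard-2 replica handles exactly the current load and that capacity scales linearly with replica count, neither of which the question guarantees. The helper `replicas_needed` and the `DEPLOY_KWARGS` dictionary are hypothetical names. The keys mirror the `machine_type`, `min_replica_count`, and `max_replica_count` parameters of `Model.deploy` in the `google-cloud-aiplatform` SDK, and the minimum of 1 replica is an assumption, since the question only specifies the maximum.

```python
import math

VCPUS_PER_REPLICA = 2   # e2-standard-2, per the question
FIXED_VCPUS = 32        # e2-highcpu-32, option B
MAX_REPLICAS = 6        # option D


def replicas_needed(load_multiplier: float, max_replicas: int = MAX_REPLICAS) -> int:
    """Replicas required for a load given in multiples of one replica's capacity.

    Assumes capacity scales linearly with replica count (a simplification).
    """
    needed = max(1, math.ceil(load_multiplier))
    if needed > max_replicas:
        raise ValueError(
            f"{load_multiplier}x load needs {needed} replicas, above the cap of {max_replicas}"
        )
    return needed


# Hypothetical autoscaling settings matching option D.
DEPLOY_KWARGS = {
    "machine_type": "e2-standard-2",
    "min_replica_count": 1,            # assumption: not stated in the question
    "max_replica_count": MAX_REPLICAS,
}

if __name__ == "__main__":
    # Compare vCPUs provisioned under autoscaling with the fixed large machine.
    for load in (1, 2.5, 4):
        n = replicas_needed(load)
        print(
            f"{load}x load: {n} replica(s), {n * VCPUS_PER_REPLICA} vCPUs autoscaled "
            f"vs {FIXED_VCPUS} vCPUs fixed"
        )
```

Under these assumptions the autoscaled endpoint provisions far fewer vCPUs than the 32-vCPU machine at both baseline and peak, which is the core of the cost argument for option D.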
Author: LeetQuiz Editorial Team
Your organization uses a scikit-learn fraud detection model deployed on a Vertex AI endpoint, which is currently configured with one e2-standard-2 machine (2 vCPUs, 8 GB memory). You observe that incoming traffic can spike to four times the endpoint's current capacity. What is the most cost-effective way to handle this?
A
Re-deploy the model with a TPU accelerator.
B
Change the machine type to e2-highcpu-32 with 32 vCPUs and 32 GB of memory.
C
Set up a monitoring job and an alert for CPU usage. If you receive an alert, scale the vCPUs as needed.
D
Increase the number of maximum replicas to 6 nodes, each with 1 e2-standard-2 machine.