
Your organization uses a scikit-learn fraud detection model deployed on a Vertex AI endpoint, which is currently configured with one e2-standard-2 machine (2 vCPUs, 8 GB memory). You observe that incoming traffic can spike to four times the endpoint's current capacity. What is the most cost-effective way to handle this?
A. Re-deploy the model with a TPU accelerator.
B. Change the machine type to e2-highcpu-32 with 32 vCPUs and 32 GB of memory.
C. Set up a monitoring job and an alert for CPU usage. If you receive an alert, scale the vCPUs as needed.
D. Increase the maximum number of replicas to 6, each running on one e2-standard-2 machine.
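
When weighing the replica-based option, it can help to check the arithmetic behind a fourfold spike. The sketch below is only an illustration: the helper `replicas_needed` and the constants are hypothetical names, and it assumes throughput scales linearly with the number of identical replicas, which the question does not state.

```python
import math


def replicas_needed(peak_multiplier: float, replica_capacity: float = 1.0) -> int:
    """Return the smallest number of identical replicas that covers the peak.

    Assumption (not from the question): capacity grows linearly with the
    number of replicas, and one replica handles `replica_capacity` units.
    """
    if peak_multiplier <= 0 or replica_capacity <= 0:
        raise ValueError("peak_multiplier and replica_capacity must be positive")
    return math.ceil(peak_multiplier / replica_capacity)


PEAK_MULTIPLIER = 4.0  # the question says traffic can spike to 4x current capacity
MAX_REPLICAS = 6       # the replica ceiling proposed in option D

needed = replicas_needed(PEAK_MULTIPLIER)
headroom = MAX_REPLICAS - needed

print(f"replicas needed at peak: {needed}")
print(f"spare replicas under a ceiling of {MAX_REPLICAS}: {headroom}")
```

Under these assumptions a 4x spike needs four e2-standard-2 replicas, so a ceiling of 6 leaves two in reserve. Because the ceiling is only an upper bound for autoscaling, the extra replicas would be billed only while they are actually running.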