
Answer-first summary for fast verification
Answer: Deploy the new model to the existing Vertex AI endpoint. Use traffic splitting to send 5% of production traffic to the new model. Monitor end-user metrics, such as listening time. If end-user metrics improve between models over time, gradually increase the percentage of production traffic sent to the new model.
Option C is the correct answer. Deploying the new model to the existing Vertex AI endpoint and using traffic splitting avoids the complexity of creating and managing a separate endpoint. Vertex AI provides built-in traffic splitting, so a small percentage (e.g., 5%) of production traffic can be routed to the new model with no custom routing service. This allows a direct comparison of both models on real user data, and monitoring end-user metrics such as listening time shows whether the new model is actually more effective. If it performs better, the share of traffic directed to it can be increased gradually. The other options fall short: A duplicates infrastructure and requires building a custom routing service, B compares the models offline on captured data rather than on live user behavior, and D swaps the model outright with no controlled comparison. Option C is efficient, minimizes risk, and ensures a seamless transition.
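The rollout logic described above — route a small slice of traffic to the new model, then shift more traffic to it if metrics improve — can be sketched as a toy simulation. This is illustrative Python only, not the Vertex AI API (in the actual Python SDK, the split is configured via the `traffic_split` or `traffic_percentage` arguments when calling `deploy` on an endpoint); the function and dictionary names below are made up for the example.

```python
import random

def route_request(traffic_split, rng=random.random):
    """Pick a deployed model for one request according to a
    percentage-based traffic split, e.g. {"current": 95, "new": 5}."""
    threshold = rng() * 100
    cumulative = 0
    for model_id, pct in traffic_split.items():
        cumulative += pct
        if threshold < cumulative:
            return model_id
    return model_id  # fallback in case percentages don't sum to 100

def ramp_up(traffic_split, new_model_id, step=10):
    """Shift `step` percentage points of traffic from the current model
    to the new one, mirroring a gradual rollout after metrics improve."""
    moved = min(step, 100 - traffic_split[new_model_id])
    traffic_split[new_model_id] += moved
    for model_id in traffic_split:
        if model_id != new_model_id:
            traffic_split[model_id] -= moved
            break
    return traffic_split

# Start with 5% of traffic on the new model, as in option C.
split = {"current": 95, "new": 5}
counts = {"current": 0, "new": 0}
for _ in range(10_000):
    counts[route_request(split)] += 1
# Roughly 5% of requests land on the new model.

# If end-user metrics (e.g., listening time) improve, ramp up gradually.
split = ramp_up(split, "new", step=20)
```

The key design point option C exploits is that both models sit behind one endpoint, so the split is a single configuration change rather than a new routing layer.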
Author: LeetQuiz Editorial Team
You work for an organization that operates a streaming music service, providing personalized music recommendations to users. You have a custom production model currently in use, which serves 'next song' recommendations based on a user's recent listening history. This model is deployed on a Vertex AI endpoint. Recently, you retrained this model using fresh and updated data, resulting in positive test performance in offline evaluations. Now, you intend to test this new model in the production environment, ensuring minimal complexity and disruption to the existing service. What should you do?
A
Create a new Vertex AI endpoint for the new model and deploy the new model to that new endpoint. Build a service to randomly send 5% of production traffic to the new endpoint. Monitor end-user metrics such as listening time. If end-user metrics improve between models over time, gradually increase the percentage of production traffic sent to the new endpoint.
B
Capture incoming prediction requests in BigQuery. Create an experiment in Vertex AI Experiments. Run batch predictions for both models using the captured data. Use the user's selected song to compare the models' performance side by side. If the new model's performance metrics are better than the previous model's, deploy the new model to production.
C
Deploy the new model to the existing Vertex AI endpoint. Use traffic splitting to send 5% of production traffic to the new model. Monitor end-user metrics, such as listening time. If end-user metrics improve between models over time, gradually increase the percentage of production traffic sent to the new model.
D
Configure a model monitoring job for the existing Vertex AI endpoint. Configure the monitoring job to detect prediction drift and set a threshold for alerts. Update the model on the endpoint from the previous model to the new model. If you receive an alert of prediction drift, revert to the previous model.