
Answer-first summary for fast verification
Answer: C and E. Option C: Set up a Cloud Storage trigger to publish a notification to a Pub/Sub topic whenever a new file is added. Use a Pub/Sub-triggered Cloud Function to launch the training job on GKE, providing an event-driven and scalable solution. Option E: Combine both the event-driven approach using Pub/Sub and Cloud Functions for immediate response and the scheduled checks with Cloud Scheduler as a fallback mechanism to ensure no data is missed.
Option C is the most efficient and scalable way to automate retraining in this CI/CD workflow. Cloud Storage publishes a notification to Pub/Sub when a new file is added, and a Cloud Function reacts to it, so the response is immediate and compute is used only when new data arrives. Option E builds on this by adding scheduled Cloud Scheduler checks as a fallback, combining the immediacy of event-driven triggers with the coverage of periodic checks so that no data is missed. Options A and B are feasible, but A introduces continuous polling overhead and B can delay retraining until the next scheduled check. Option D adds unnecessary complexity by coupling data processing with model training.
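The event-driven path in option C can be sketched as a small handler. This is a minimal illustration, not a definitive implementation. It assumes the first-generation Cloud Functions Pub/Sub event shape (a dict with `attributes` and base64-encoded `data`) and the standard Cloud Storage notification attributes `eventType`, `bucketId`, and `objectId`. The name `handle_gcs_notification`, the `prefix` filter, and the injected `launch_training` callable are hypothetical; in a real deployment that callable would submit the Kubeflow Pipelines run on GKE.

```python
import base64
import json
from typing import Callable, Optional


def handle_gcs_notification(
    event: dict,
    launch_training: Callable[[str, str], None],
    prefix: str = "",
) -> Optional[str]:
    """Start retraining for a newly finalized Cloud Storage object.

    Returns the gs:// URI that triggered training, or None if the event
    was ignored. `launch_training` is a hypothetical hook that would
    submit the pipeline run.
    """
    attributes = event.get("attributes") or {}

    # Only newly created objects should trigger retraining;
    # deletions and metadata updates are ignored.
    if attributes.get("eventType") != "OBJECT_FINALIZE":
        return None

    bucket = attributes.get("bucketId")
    name = attributes.get("objectId")

    # Fall back to the JSON payload (base64-encoded by Pub/Sub)
    # if the attributes do not carry the bucket or object name.
    if (not bucket or not name) and event.get("data"):
        payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
        bucket = bucket or payload.get("bucket")
        name = name or payload.get("name")

    # Skip events we cannot identify or that fall outside the watched prefix.
    if not bucket or not name or not name.startswith(prefix):
        return None

    launch_training(bucket, name)
    return f"gs://{bucket}/{name}"


if __name__ == "__main__":
    sample_event = {
        "attributes": {
            "eventType": "OBJECT_FINALIZE",
            "bucketId": "example-bucket",
            "objectId": "cleaned/part-0001.csv",
        }
    }
    # A stub launcher stands in for the real pipeline submission.
    handle_gcs_notification(
        sample_event,
        launch_training=lambda bucket, name: print("retrain on", bucket, name),
        prefix="cleaned/",
    )
```

Passing the launcher in as a parameter keeps the event parsing testable without a cluster, and the function does no work at all between notifications, which is where the cost advantage over polling comes from.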
Author: LeetQuiz Editorial Team
Your team is working on a machine learning project where data is continuously cleaned and stored in a Cloud Storage bucket. To ensure your ML model stays up-to-date with the latest data, you plan to automate the retraining process using Kubeflow Pipelines on Google Kubernetes Engine (GKE) as part of a CI/CD workflow. The solution must be cost-effective, scalable, and responsive to new data arrivals without unnecessary delays or overhead. Considering these requirements, what is the best way to architect this workflow? Choose the two most appropriate options.
A
Develop a lightweight application using App Engine that continuously monitors the Cloud Storage bucket for new files and initiates the training job upon detection, ensuring immediate response to new data.
B
Configure Cloud Scheduler to periodically check the Cloud Storage bucket for new files. If new files are detected since the last check, it triggers the training job on GKE. This approach ensures regular checks but may introduce delays.
C
Set up a Cloud Storage trigger to publish a notification to a Pub/Sub topic whenever a new file is added. Use a Pub/Sub-triggered Cloud Function to launch the training job on GKE, providing an event-driven and scalable solution.
D
Utilize Dataflow to process and save files in Cloud Storage, with a subsequent step to automatically start the training job on GKE. This method integrates data processing and model training but may add complexity.
E
Combine both the event-driven approach using Pub/Sub and Cloud Functions for immediate response and the scheduled checks with Cloud Scheduler as a fallback mechanism to ensure no data is missed.
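The fallback in option E can be sketched as a reconciliation step that a Cloud Scheduler job would invoke periodically. This is a minimal sketch under stated assumptions: the bucket listing is passed in as `(object name, generation)` pairs, `processed` is a hypothetical record of objects already handled by the event-driven path (in practice it would need durable storage), and `launch_training` is a hypothetical hook for submitting the training job. Keying on name plus generation also guards against duplicate triggers, since Pub/Sub delivers messages at least once.

```python
from typing import Callable, Iterable, List, Set, Tuple

# (object name, object generation) uniquely identifies one version of a file.
ObjectKey = Tuple[str, int]


def reconcile(
    listed: Iterable[ObjectKey],
    processed: Set[ObjectKey],
    launch_training: Callable[[str, int], None],
) -> List[ObjectKey]:
    """Trigger training for any listed object the event path has not handled.

    `processed` is updated in place so that repeated runs are idempotent.
    Returns the keys that were missed, in sorted order.
    """
    missed = sorted(key for key in set(listed) if key not in processed)
    for name, generation in missed:
        launch_training(name, generation)
        processed.add((name, generation))
    return missed


if __name__ == "__main__":
    already_handled = {("cleaned/part-0001.csv", 1)}
    bucket_listing = [
        ("cleaned/part-0001.csv", 1),  # handled by the Pub/Sub path
        ("cleaned/part-0002.csv", 1),  # notification was missed
    ]
    reconcile(
        bucket_listing,
        already_handled,
        launch_training=lambda name, gen: print("catch-up retrain on", name, gen),
    )
```

Because the scheduled job only acts on objects missing from the processed record, it adds little overhead when the event-driven path is healthy.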