
Answer-first summary for fast verification
Answer: C. Implement a Cloud Storage trigger to send a notification to a Pub/Sub topic when a new file is added. Use a Pub/Sub-triggered Cloud Function to start the training job on GKE, leveraging event-driven architecture for efficiency. E. Combine both Cloud Scheduler for periodic checks and Cloud Storage triggers for immediate notifications, creating a hybrid approach that ensures no new data is missed while optimizing resource usage.
The best solution is option C: a Cloud Storage trigger sends a notification to a Pub/Sub topic when a new file arrives, and a Pub/Sub-triggered Cloud Function starts the Kubeflow Pipelines training job on GKE. Because nothing runs until data arrives, this event-driven approach is efficient and scalable, and it aligns with MLOps best practices. Option E is a hybrid that pairs the immediacy of event-driven triggers with periodic Cloud Scheduler checks as a safety net, which minimizes the risk of missing new data while conserving resources. For detailed insights, refer to Google Cloud's architecture for MLOps.
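The event-driven path in option C can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: it assumes the function receives a Pub/Sub message carrying a Cloud Storage notification (with the `eventType`, `bucketId`, `objectId`, and `objectGeneration` attributes), and the names `handle_event`, `submit_run`, and `seen` are hypothetical. `submit_run` stands in for whatever call starts the Kubeflow Pipelines run, and `seen` stands in for a durable store of already-processed objects, so that a redelivered message or a periodic check (as in option E) does not start a duplicate run.

```python
import base64
import json
from typing import Callable, Optional, Set


def parse_gcs_notification(event: dict) -> Optional[dict]:
    """Return bucket/object info for a newly finalized object, else None."""
    attrs = event.get("attributes") or {}
    # Only react to objects that were created or overwritten.
    if attrs.get("eventType") != "OBJECT_FINALIZE":
        return None
    payload = {}
    if event.get("data"):
        # The Pub/Sub message body is base64-encoded JSON object metadata.
        payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    return {
        "bucket": attrs.get("bucketId") or payload.get("bucket"),
        "name": attrs.get("objectId") or payload.get("name"),
        "generation": attrs.get("objectGeneration") or payload.get("generation"),
    }


def handle_event(event: dict,
                 submit_run: Callable[[dict], str],
                 seen: Set[str]) -> Optional[str]:
    """Start one training run per new object version (hypothetical handler).

    submit_run: assumed callable that launches the pipeline run and returns an ID.
    seen: assumed stand-in for a durable record of processed object versions.
    """
    obj = parse_gcs_notification(event)
    if obj is None:
        return None
    # Bucket, object name, and generation together identify one object version.
    key = f"{obj['bucket']}/{obj['name']}#{obj['generation']}"
    if key in seen:
        return None  # already handled; skip the duplicate delivery
    seen.add(key)
    data_uri = f"gs://{obj['bucket']}/{obj['name']}"
    return submit_run({"data_uri": data_uri})
```

Keeping the run submission behind an injected callable means the same handler could be invoked from the Pub/Sub-triggered function and from a scheduled reconciliation job, with the deduplication key preventing double-triggering.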
Author: LeetQuiz Editorial Team
You are collaborating with a data engineering team that has established a pipeline to cleanse and store datasets in a Cloud Storage bucket. Your team has developed an ML model that requires automatic updates whenever new data becomes available, as part of a CI/CD workflow. The workflow must efficiently trigger a Kubeflow Pipelines training job on Google Kubernetes Engine (GKE) without unnecessary resource consumption or delays. Considering cost-effectiveness, scalability, and real-time processing, which of the following solutions is the BEST way to architect this workflow? (Choose two options if option E is offered.)
A
Develop a lightweight Python client using App Engine to continuously monitor Cloud Storage for new files and initiate the training job upon detection, ensuring minimal latency.
B
Configure Cloud Scheduler to periodically check the Cloud Storage bucket for new files. If no new files are found since the last check, the job is terminated to save resources.
C
Implement a Cloud Storage trigger to send a notification to a Pub/Sub topic when a new file is added. Use a Pub/Sub-triggered Cloud Function to start the training job on GKE, leveraging event-driven architecture for efficiency.
D
Utilize Dataflow to process and store files in Cloud Storage, then automatically trigger the training job on GKE once the file is saved, ensuring data is processed in a streamlined manner.
E
Combine both Cloud Scheduler for periodic checks and Cloud Storage triggers for immediate notifications, creating a hybrid approach that ensures no new data is missed while optimizing resource usage.