
Answer-first summary for fast verification
Answer: Configure a Cloud Storage trigger to send a message to a Pub/Sub topic when a new file is available in a storage bucket. Use a Pub/Sub-triggered Cloud Function to start the training job on a GKE cluster.
The correct answer is C. A Cloud Storage trigger publishes a message to a Pub/Sub topic whenever a new file lands in the bucket, and a Pub/Sub-triggered Cloud Function then starts the Kubeflow Pipelines training job on the GKE cluster. Because the workflow is event-driven, training begins automatically as soon as new data arrives, which matches the goal of refreshing the ML model as part of the CI/CD workflow. The other options either poll continuously for new files or check on a fixed schedule, which wastes resources or delays the refresh.
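As a rough sketch of the Cloud Function side of option C: a Pub/Sub-triggered function receives the Cloud Storage notification as a base64-encoded JSON payload, extracts the bucket and object name, and would then kick off the Kubeflow Pipelines run. The entry-point name, the `KFP_HOST` value, and the pipeline package path are hypothetical placeholders, and the actual Kubeflow client call is shown commented out since it depends on your cluster setup.

```python
import base64
import json

# Hypothetical endpoint of the Kubeflow Pipelines API server on the GKE cluster.
KFP_HOST = "https://kfp.example.internal"

def handle_gcs_event(event):
    """Entry point for a Pub/Sub-triggered Cloud Function (name is illustrative).

    `event["data"]` carries the base64-encoded Cloud Storage notification
    published to the topic when a new object is finalized in the bucket.
    """
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    bucket, name = payload["bucket"], payload["name"]

    # Here the real function would start the training run, for example with
    # the Kubeflow Pipelines SDK (left commented so this sketch stays
    # self-contained):
    #
    #   import kfp
    #   client = kfp.Client(host=KFP_HOST)
    #   client.create_run_from_pipeline_package(
    #       "training_pipeline.yaml",
    #       arguments={"input_uri": f"gs://{bucket}/{name}"},
    #   )
    return bucket, name
```

Before any of this fires, the bucket must be wired to the topic (e.g. with `gsutil notification create -t TOPIC -f json gs://BUCKET`), which is the "Cloud Storage trigger" the answer refers to.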
Author: LeetQuiz Editorial Team
You are working as a machine learning engineer in collaboration with a data engineering team. This team has developed a data pipeline that cleans datasets and saves them into a Cloud Storage bucket. Now, you have created a machine learning model, and you need to ensure that your model is updated with new data as soon as it becomes available. As part of your Continuous Integration/Continuous Deployment (CI/CD) workflow, you aim to automatically run a Kubeflow Pipelines training job on a Google Kubernetes Engine (GKE) cluster. What approach should you take to design this workflow?
A
Configure your pipeline with Dataflow, which saves the files in Cloud Storage. After the file is saved, start the training job on a GKE cluster.
B
Use App Engine to create a lightweight Python client that continuously polls Cloud Storage for new files. As soon as a file arrives, initiate the training job.
C
Configure a Cloud Storage trigger to send a message to a Pub/Sub topic when a new file is available in a storage bucket. Use a Pub/Sub-triggered Cloud Function to start the training job on a GKE cluster.
D
Use Cloud Scheduler to schedule jobs at a regular interval. For the first step of the job, check the timestamp of objects in your Cloud Storage bucket. If there are no new files since the last run, abort the job.