
You are productionizing a machine learning model built with the Keras framework. The model was initially developed in a Jupyter notebook on a data scientist's local machine, and the notebook includes cells for data validation and model analysis. Your task is to orchestrate these steps and automate their execution so the model can be retrained weekly. Because you expect a significant increase in training data over time, your solution should use managed services to remain scalable and cost-effective. Given this context, what approach should you take?
A
Move the Jupyter notebook to a Notebooks instance on the largest N2 machine type, and schedule the execution of the steps in the Notebooks instance using Cloud Scheduler.
B
Write the code as a TensorFlow Extended (TFX) pipeline orchestrated with Vertex AI Pipelines. Use standard TFX components for data validation and model analysis, and use Vertex AI Pipelines for model retraining (see the sketch after the options).
C
Rewrite the steps in the Jupyter notebook as an Apache Spark job, and schedule the execution of the job on ephemeral Dataproc clusters using Cloud Scheduler.
D
Extract the steps contained in the Jupyter notebook as Python scripts, wrap each script in an Apache Airflow BashOperator, and run the resulting directed acyclic graph (DAG) in Cloud Composer.
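
For reference, the TFX-based approach described in option B could look roughly like the sketch below. This is a minimal illustration under stated assumptions, not a definitive implementation: the pipeline name, GCS paths, module file, step counts, and label key are all hypothetical placeholders, and it assumes CSV training data plus a trainer module whose run_fn builds and trains the Keras model.

```python
# Minimal TFX pipeline sketch (option B). All paths, names, and the label key
# below are hypothetical placeholders: replace them with real project values.
import tensorflow_model_analysis as tfma
from tfx import v1 as tfx

PIPELINE_NAME = "weekly-keras-retrain"          # hypothetical
PIPELINE_ROOT = "gs://my-bucket/pipeline-root"  # hypothetical GCS path
DATA_ROOT = "gs://my-bucket/data"               # hypothetical; assumes CSV files
MODULE_FILE = "gs://my-bucket/trainer.py"       # run_fn builds/trains the Keras model


def create_pipeline() -> tfx.dsl.Pipeline:
    # Ingest raw training examples.
    example_gen = tfx.components.CsvExampleGen(input_base=DATA_ROOT)

    # Data validation: compute statistics, infer a schema, flag anomalies.
    statistics_gen = tfx.components.StatisticsGen(
        examples=example_gen.outputs["examples"])
    schema_gen = tfx.components.SchemaGen(
        statistics=statistics_gen.outputs["statistics"])
    example_validator = tfx.components.ExampleValidator(
        statistics=statistics_gen.outputs["statistics"],
        schema=schema_gen.outputs["schema"])

    # Retrain the Keras model defined in the module file's run_fn.
    trainer = tfx.components.Trainer(
        module_file=MODULE_FILE,
        examples=example_gen.outputs["examples"],
        schema=schema_gen.outputs["schema"],
        train_args=tfx.proto.TrainArgs(num_steps=1000),  # hypothetical step counts
        eval_args=tfx.proto.EvalArgs(num_steps=100))

    # Model analysis with the standard Evaluator component.
    evaluator = tfx.components.Evaluator(
        examples=example_gen.outputs["examples"],
        model=trainer.outputs["model"],
        eval_config=tfma.EvalConfig(
            model_specs=[tfma.ModelSpec(label_key="label")],  # hypothetical label
            slicing_specs=[tfma.SlicingSpec()],
            metrics_specs=[tfma.MetricsSpec(metrics=[
                tfma.MetricConfig(class_name="BinaryAccuracy")])]))

    return tfx.dsl.Pipeline(
        pipeline_name=PIPELINE_NAME,
        pipeline_root=PIPELINE_ROOT,
        components=[example_gen, statistics_gen, schema_gen,
                    example_validator, trainer, evaluator])


# Compile the pipeline to a job spec that Vertex AI Pipelines can run.
runner = tfx.orchestration.experimental.KubeflowV2DagRunner(
    config=tfx.orchestration.experimental.KubeflowV2DagRunnerConfig(),
    output_filename="pipeline.json")
runner.run(create_pipeline())
```

The resulting pipeline.json can be submitted as a Vertex AI PipelineJob using the google-cloud-aiplatform client, and weekly execution can then be scheduled, for example by having Cloud Scheduler trigger a small function that submits the job.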