
You are working on deploying a machine learning workflow from a prototype to production. Your feature engineering code is written in PySpark and currently runs on Dataproc Serverless. For model training, you use a Vertex AI custom training job. Currently, these two steps are disconnected, and the model training step must be initiated manually after the feature engineering step completes. To streamline this process and create a scalable, maintainable production workflow that runs end-to-end and tracks the connections between the steps, what should you do?
A
Create a Vertex AI Workbench notebook. Use the notebook to submit the Dataproc Serverless feature engineering job. Use the same notebook to submit the custom model training job. Run the notebook cells sequentially to tie the steps together end-to-end.
B
Create a Vertex AI Workbench notebook. Initiate an Apache Spark context in the notebook and run the PySpark feature engineering code. Use the same notebook to run the custom model training job in TensorFlow. Run the notebook cells sequentially to tie the steps together end-to-end.
C
Use the Kubeflow Pipelines SDK to write code that specifies two components: the first is a Dataproc Serverless component that launches the feature engineering job; the second is a custom component wrapped in the create_custom_training_job_from_component utility that launches the custom model training job. Create a Vertex AI Pipelines job to link and run both components.
D
Use the Kubeflow Pipelines SDK to write code that specifies two components: the first initiates an Apache Spark context that runs the PySpark feature engineering code; the second runs the TensorFlow custom model training code. Create a Vertex AI Pipelines job to link and run both components.
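The approach in option C can be sketched roughly as follows. This is a minimal, hypothetical pipeline definition assuming the google-cloud-pipeline-components and kfp packages are installed; the project ID, region, bucket paths, and the train_model component body are placeholders, not the actual workflow from the question.

```python
# Sketch of option C: two linked components in a Vertex AI pipeline.
# Assumes `kfp` and `google-cloud-pipeline-components` are installed.
from kfp import dsl, compiler
from google_cloud_pipeline_components.v1.dataproc import DataprocPySparkBatchOp
from google_cloud_pipeline_components.v1.custom_job import (
    create_custom_training_job_from_component,
)

PROJECT = "my-project"   # placeholder
REGION = "us-central1"   # placeholder


@dsl.component
def train_model(features_uri: str):
    """Placeholder training step; real code would load features and fit a model."""
    print(f"Training on features at {features_uri}")


# Wrap the lightweight component so it runs as a Vertex AI custom training job.
train_job_op = create_custom_training_job_from_component(
    train_model,
    display_name="custom-training",
    machine_type="n1-standard-4",
)


@dsl.pipeline(name="feature-eng-and-training")
def pipeline(features_uri: str = "gs://my-bucket/features/"):
    # Component 1: run the PySpark feature engineering job on Dataproc Serverless.
    feature_eng = DataprocPySparkBatchOp(
        project=PROJECT,
        location=REGION,
        main_python_file_uri="gs://my-bucket/feature_engineering.py",  # placeholder
    )
    # Component 2: launch training only after feature engineering completes.
    # This explicit dependency is what Vertex AI Pipelines records as lineage.
    train_job_op(
        project=PROJECT,
        location=REGION,
        features_uri=features_uri,
    ).after(feature_eng)


# Compile to a pipeline spec, then submit it as a Vertex AI Pipelines job.
compiler.Compiler().compile(pipeline, "pipeline.json")
```

Because both steps live in one pipeline spec, Vertex AI Pipelines runs them end-to-end and tracks the dependency between them, which is what the two notebook-based options (A and B) cannot provide.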