
Answer-first summary for fast verification
Answer: Move the first step of your pipeline to a separate step, and provide a cached path to Cloud Storage as an input to the main pipeline.
The correct answer is C. The data export from BigQuery is expensive, and its output does not need to be regenerated when you only adjust the code and parameters of the training step. Move that first step out of the main pipeline into a separate step, and pass the Cloud Storage path of its stored output to the main pipeline as an input. Repeated model iterations then reuse the exported data instead of re-running the export, which reduces overall costs.
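The idea behind option C can be sketched without any cloud dependencies. The following is a minimal illustration, not Vertex AI or Kubeflow Pipelines code: a local temporary directory stands in for the Cloud Storage bucket, and `export_table`, `main_pipeline`, and `iterate` are hypothetical names invented for this example. It shows only the cost pattern: the export runs once, and each training iteration receives the stored path as an input.

```python
import json
import tempfile
from pathlib import Path

# Assumption: a local directory plays the role of a Cloud Storage bucket.
# Counter used to show how often the costly export actually runs.
export_calls = 0


def export_table(dest: Path) -> Path:
    """Hypothetical stand-in for the costly BigQuery export step."""
    global export_calls
    export_calls += 1
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Toy rows in place of the real exported table.
    dest.write_text(json.dumps([{"x": 1.0, "y": 0}, {"x": 2.0, "y": 1}]))
    return dest


def main_pipeline(exported_path: Path, learning_rate: float) -> dict:
    """Hypothetical main pipeline: takes the stored export path as an input."""
    rows = json.loads(exported_path.read_text())
    # Preprocessing, training, and calibration would happen here.
    return {"rows_seen": len(rows), "learning_rate": learning_rate}


def iterate(learning_rates, workdir: Path) -> list:
    # The export is a separate step that runs once, outside the loop.
    exported_path = export_table(workdir / "export" / "data.json")
    # Each iteration changes only training parameters and reuses the path.
    return [main_pipeline(exported_path, lr) for lr in learning_rates]


workdir = Path(tempfile.mkdtemp())
results = iterate([0.1, 0.01, 0.001], workdir)
print(export_calls, len(results))
```

In a real setup, the path would be a `gs://` URI passed as a pipeline parameter; the exact parameter names and component definitions depend on your pipeline and are not specified in the question.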
Author: LeetQuiz Editorial Team
You developed a Vertex AI pipeline that trains a classification model on data stored in a large BigQuery table. The pipeline has four steps, each created by a Python function that uses the Kubeflow v2 API. You observe high development costs, particularly during the data export and preprocessing steps. You need to reduce model development costs, especially for frequent model iterations that adjust the code and parameters of the training step. What should you do to optimize the costs?
A
Change the components’ YAML filenames to export.yaml, preprocess.yaml, f"train-{dt}.yaml", f"calibrate-{dt}.yaml".
B
Add the {"kubeflow.v1.caching": True} parameter to the set of params provided to your PipelineJob.
C
Move the first step of your pipeline to a separate step, and provide a cached path to Cloud Storage as an input to the main pipeline.
D
Change the name of the pipeline to f"my-awesome-pipeline-{dt}".