In the context of a large-scale machine learning project on Vertex AI, your team encounters an out-of-memory error during the evaluation step of a model training pipeline. The project has strict constraints on maintaining evaluation quality and minimizing infrastructure overhead. Additionally, the solution must comply with the organization's policy of using managed services where possible to reduce operational complexity. Which of the following solutions BEST addresses these requirements?
A. Migrate the pipeline to Kubeflow hosted on Google Kubernetes Engine, configuring the node parameters specifically for the evaluation step to ensure sufficient memory allocation.

B. Utilize the --runner=DataflowRunner flag in beam_pipeline_args to offload the evaluation step to Dataflow, leveraging its autoscaling capabilities to handle memory requirements dynamically.

C. Isolate the evaluation step from the pipeline and execute it on custom Compute Engine VMs with pre-configured high memory capacity, despite the increased operational overhead.

D. Implement tfma.MetricsSpec() to limit the number of metrics calculated during the evaluation step, reducing memory usage without significantly impacting evaluation quality.

E. Combine the use of Kubeflow on Google Kubernetes Engine for the pipeline with Dataflow for the evaluation step, optimizing resource allocation across both services.
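Option B describes the standard TFX pattern of pointing Beam-based components (such as the Evaluator) at Dataflow via pipeline arguments. A minimal sketch of such an argument list is below; the project ID, region, and bucket are placeholders, not values from the question.

```python
# Hedged sketch: Beam pipeline args that offload Beam-based TFX steps
# (e.g. the Evaluator) to Dataflow. All resource names are placeholders.
beam_pipeline_args = [
    "--runner=DataflowRunner",              # run Beam work on Dataflow
    "--project=my-gcp-project",             # placeholder project ID
    "--region=us-central1",                 # placeholder region
    "--temp_location=gs://my-bucket/tmp",   # placeholder temp bucket
]

# These args would typically be passed to the TFX pipeline definition,
# e.g. via a `beam_pipeline_args=` parameter on the pipeline object.
print(beam_pipeline_args[0])
```

Because Dataflow is a managed, autoscaling service, this keeps operational overhead low while letting the evaluation step acquire the memory it needs.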
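For context on option D: TFMA lets you declare which metrics the Evaluator computes through metrics specs in its EvalConfig. The sketch below uses the text-proto form of that config with illustrative metric class names; the equivalent Python-API form would use tfma.MetricsSpec, as the option states.

```python
# Hedged sketch: an EvalConfig fragment (text-proto form) restricting the
# Evaluator to a small metric set. Metric class names are illustrative.
eval_config_pbtxt = """
metrics_specs {
  metrics { class_name: "ExampleCount" }
  metrics { class_name: "BinaryAccuracy" }
}
"""

# Roughly equivalent in the Python API (not executed here):
#   tfma.MetricsSpec(metrics=[tfma.MetricConfig(class_name="BinaryAccuracy")])
print("metrics_specs" in eval_config_pbtxt)
```

Trimming the metric list lowers the Evaluator's memory footprint, though unlike option B it changes what is evaluated rather than where the work runs.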