
Answer-first summary for fast verification
Answer: Use Vertex AI Training to submit training jobs, which is compatible with any framework, offers managed infrastructure, and integrates seamlessly with other Vertex AI services.
Vertex AI Training is the best fit for several reasons:

- **Framework Compatibility**: Through prebuilt or custom containers, it supports a wide array of frameworks, including Keras, PyTorch, Theano, scikit-learn, and custom libraries, so data scientists can keep using their preferred tools.
- **Managed Infrastructure**: It automates hardware provisioning, resource allocation, and job scheduling, reducing the administrative burden on your team.
- **Scalability**: It scales to accommodate large training jobs, provisioning resources as they are needed.
- **Integration**: It integrates with other Vertex AI services, offering a single platform for machine learning workflows.

The alternatives fall short. Slurm requires manual setup and resource management, Kubeflow on Google Kubernetes Engine requires managing Kubernetes clusters and TFJob may not support all frameworks natively, and a repository of VM images adds management overhead and can scale poorly.
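To make the framework-compatibility point concrete, here is a minimal sketch of how one job-spec shape could serve every framework when each is packaged as a custom container. The image URIs, the `FRAMEWORK_IMAGES` mapping, and the helper names `build_worker_pool_specs` and `submit_job` are hypothetical and chosen for illustration; the machine type is likewise an assumption. Submission uses the `google-cloud-aiplatform` SDK's `CustomJob` and needs a real project, region, and staging bucket.

```python
"""Sketch: the same job-spec shape for every framework, via custom containers."""

# Hypothetical container images, one per framework the team uses.
FRAMEWORK_IMAGES = {
    "keras": "us-docker.pkg.dev/my-project/training/keras:latest",
    "pytorch": "us-docker.pkg.dev/my-project/training/pytorch:latest",
    "theano": "us-docker.pkg.dev/my-project/training/theano:latest",
    "scikit-learn": "us-docker.pkg.dev/my-project/training/sklearn:latest",
    "custom": "us-docker.pkg.dev/my-project/training/custom:latest",
}


def build_worker_pool_specs(framework, machine_type="n1-standard-4",
                            replica_count=1, args=None):
    """Return a worker_pool_specs list for a custom-container training job."""
    if framework not in FRAMEWORK_IMAGES:
        raise KeyError(f"no training image registered for {framework!r}")
    return [
        {
            "machine_spec": {"machine_type": machine_type},
            "replica_count": replica_count,
            "container_spec": {
                "image_uri": FRAMEWORK_IMAGES[framework],
                "args": list(args or []),
            },
        }
    ]


def submit_job(framework, project, location, staging_bucket):
    """Submit the job to Vertex AI Training (needs credentials and the SDK)."""
    # Imported lazily so the spec builder works without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location,
                    staging_bucket=staging_bucket)
    job = aiplatform.CustomJob(
        display_name=f"{framework}-training",
        worker_pool_specs=build_worker_pool_specs(framework),
    )
    job.run()  # blocks until the managed job finishes
    return job


if __name__ == "__main__":
    # Only the image changes between frameworks; the spec shape does not.
    for name in FRAMEWORK_IMAGES:
        spec = build_worker_pool_specs(name)[0]
        print(name, "->", spec["container_spec"]["image_uri"])
```

Running the file directly only prints the image chosen for each framework; nothing is submitted unless `submit_job` is called with real project settings.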
Author: LeetQuiz Editorial Team
Your team of data scientists submits training jobs through a cloud-based backend system that has become increasingly complex to administer because of the variety of frameworks in use, including Keras, PyTorch, Theano, scikit-learn, and custom libraries. You are considering a managed service to streamline this process. The solution must support all these frameworks, minimize administrative overhead, and scale efficiently as training jobs grow in size and complexity. It should also integrate well with other machine learning services to support a comprehensive workflow. Which of the following options would be the most effective solution? (Choose one correct option)
A
Set up a Slurm workload manager to schedule and run jobs on your cloud infrastructure, requiring manual setup and management of resources.
B
Deploy Kubeflow on Google Kubernetes Engine and use TFJob for submitting training jobs, which involves managing Kubernetes clusters and may not support all frameworks natively.
C
Use Vertex AI Training to submit training jobs, which is compatible with any framework, offers managed infrastructure, and integrates seamlessly with other Vertex AI services.
D
Create a centralized repository of VM images on Compute Engine for your team to use, leading to potential scalability issues and increased management overhead.