
Ultimate access to all questions.
As the leader of a data science team in a global corporation, you're tasked with optimizing the team's Google Cloud compute costs without compromising the performance of your machine learning models. Currently, the team utilizes high-level TensorFlow APIs on AI Platform with GPUs for model refinement, a process that can take weeks or months. Given the constraints of budget, time, and the need for model effectiveness, which of the following strategies would you recommend? (Choose two correct options)
A
Migrate to training with Kubernetes on Google Kubernetes Engine, utilizing preemptible VMs with checkpoints to save costs and ensure training continuity.
B
Continue using AI Platform for distributed training jobs but eliminate the use of checkpoints to simplify the process.
C
Switch to training with Kubernetes on Google Kubernetes Engine, using preemptible VMs without checkpoints to maximize cost savings.
D
Optimize costs by using AI Platform for distributed training jobs with checkpoints, ensuring that training can resume after interruptions.
E
Combine the use of AI Platform for initial model training with Kubernetes on Google Kubernetes Engine for fine-tuning, using preemptible VMs and checkpoints for both phases.