
You are developing an image recognition model in PyTorch based on the ResNet50 architecture. Your code runs successfully on your local laptop with a small subsample of the data. However, your full dataset contains 200,000 labeled images, and you need to scale your training workload efficiently while minimizing cost. Given that you plan to use 4 V100 GPUs for this task, which of the following approaches would be the most effective?
A
Create a Google Kubernetes Engine cluster with a node pool that has 4 V100 GPUs. Prepare and submit a TFJob operator to this node pool.
B
Create a Vertex AI Workbench user-managed notebooks instance with 4 V100 GPUs, and use it to train your model.
C
Package your code with Setuptools, and use a pre-built container. Train your model with Vertex AI using a custom tier that contains the required GPUs.
D
Configure a Compute Engine VM with all the dependencies needed for training. Train your model with Vertex AI using a custom tier that contains the required GPUs.
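For context, the workflow described in option C can be sketched as follows. This is a minimal, illustrative sketch only: the project ID, bucket name, package path, and module name (`trainer.task`) are placeholders, and the pre-built PyTorch GPU container tag shown is an assumption that should be checked against the current Vertex AI pre-built container list.

```shell
# Sketch of option C: package the training code with Setuptools,
# upload it, and submit a Vertex AI custom training job that runs
# it in a pre-built PyTorch GPU container on 4 V100 GPUs.
# All names (bucket, region, module, image tag) are placeholders.

# 1. Build a source distribution of the trainer package (setup.py assumed present).
python setup.py sdist --formats=gztar

# 2. Stage the package in Cloud Storage (bucket name is hypothetical).
gsutil cp dist/trainer-0.1.tar.gz gs://my-bucket/staging/trainer-0.1.tar.gz

# 3. Submit the custom job, requesting a machine with 4 V100 accelerators.
gcloud ai custom-jobs create \
  --region=us-central1 \
  --display-name=resnet50-training \
  --worker-pool-spec=machine-type=n1-standard-16,replica-count=1,accelerator-type=NVIDIA_TESLA_V100,accelerator-count=4,executor-image-uri=us-docker.pkg.dev/vertex-ai/training/pytorch-gpu.1-13:latest,python-module=trainer.task \
  --python-package-uris=gs://my-bucket/staging/trainer-0.1.tar.gz
```

With this approach, Vertex AI provisions the GPUs only for the duration of the job and tears them down afterwards, which is why this pattern keeps costs lower than a long-running notebook or self-managed VM.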