You are training an object detection model on a large dataset of three million X-ray images, each approximately 2 GB in size. Training runs on Vertex AI using a Compute Engine instance with 32 vCPUs, 128 GB of RAM, and one NVIDIA P100 GPU. Despite these substantial computational resources, model training is exceptionally slow. What should you do to reduce training time while maintaining the model's performance?
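With images this large, a single P100 often sits idle waiting on I/O, so one common remedy is to overlap data loading with GPU compute (the idea behind input-pipeline prefetching, e.g. `tf.data`'s `prefetch`). The sketch below illustrates that pattern with the Python standard library only; `load_batch` and the timings are hypothetical stand-ins, not part of the question.

```python
import time
from collections import deque
from concurrent.futures import ThreadPoolExecutor

def load_batch(i):
    # Hypothetical I/O-bound read of one batch of X-ray images.
    time.sleep(0.01)
    return f"batch-{i}"

def prefetching_batches(n_batches, depth=4):
    """Yield batches in order while up to `depth` loads run in background threads,
    so the consumer (the GPU step) rarely blocks on disk or network I/O."""
    with ThreadPoolExecutor(max_workers=depth) as pool:
        pending = deque(pool.submit(load_batch, i)
                        for i in range(min(depth, n_batches)))
        nxt = len(pending)
        while pending:
            batch = pending.popleft().result()   # oldest load; likely done already
            if nxt < n_batches:
                pending.append(pool.submit(load_batch, nxt))  # keep pipeline full
                nxt += 1
            yield batch

for b in prefetching_batches(8):
    pass  # training step would consume `b` here
```

The same principle, applied at scale, means storing the images in a format built for sequential streaming reads (such as TFRecord) rather than loading multi-gigabyte files one at a time.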