You are working on a deep learning project using Keras and TensorFlow. Initially, you trained your model on a single GPU, but training was slower than expected. To address this, you attempted to distribute training across four GPUs using tf.distribute.MirroredStrategy. However, you observed no significant improvement in training time. Considering the project's constraints, including the need for cost efficiency and scalability, which of the following strategies would be most effective in significantly accelerating training? Choose the best option.
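For context, the setup described above would typically look like the minimal sketch below. The model architecture, dataset, and batch size are illustrative assumptions, not details given in the question; the key points are that variables must be created inside the strategy's scope and that an input pipeline or batch size that cannot keep all replicas busy is a common reason multi-GPU training shows no speedup.

```python
import tensorflow as tf

# MirroredStrategy replicates the model on each visible GPU and
# all-reduces gradients across replicas after every step.
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

# The model and optimizer must be built inside the strategy scope
# so their variables are mirrored across devices.
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(256, activation="relu", input_shape=(784,)),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )

# Scale the global batch size with the replica count; without this,
# each GPU processes too little work per step for parallelism to pay off.
global_batch_size = 64 * strategy.num_replicas_in_sync

# Illustrative data: MNIST flattened to match the input shape above.
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0

model.fit(x_train, y_train, batch_size=global_batch_size, epochs=2)
```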