
Answer-first summary for fast verification
Answer: Split into multiple CSV files and use a parallel interleave transformation.
The correct answer is C: split the data into multiple CSV files and use a parallel interleave transformation. Sharding the dataset lets multiple workers read in parallel, which can substantially improve input-pipeline throughput. Converting the 5-terabyte CSV to a TFRecord file (Option A) improves read efficiency but is a time-consuming preprocessing step and, done naively, still leaves a single large file that only one reader can consume at a time. Training on a random 10-gigabyte subset (Option B) discards most of the data, and setting reshuffle_each_iteration to true (Option D) changes shuffle behavior rather than read throughput. Splitting the CSV into shards lets you parallelize data loading and preprocessing, addressing the input-pipeline bottleneck directly.
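A minimal sketch of the recommended pattern using `tf.data.Dataset.interleave`. The shard files here are small local stand-ins created by the script itself (in practice they would be the split files in GCS, e.g. a pattern like `gs://bucket/data/part-*.csv`); the file names and contents are illustrative assumptions.

```python
import os
import tempfile
import tensorflow as tf

# Create a few small CSV shards locally to stand in for the split files
# (in production these would live in GCS, read via a "gs://..." pattern).
shard_dir = tempfile.mkdtemp()
for i in range(4):
    with open(os.path.join(shard_dir, f"part-{i:03d}.csv"), "w") as f:
        f.write("feature,label\n")        # header row
        f.write(f"{i * 1.0},{i % 2}\n")   # one data row per shard

# List the shards, then interleave reads so several files are consumed
# concurrently; AUTOTUNE lets tf.data pick the parallelism level.
files = tf.data.Dataset.list_files(os.path.join(shard_dir, "part-*.csv"))
dataset = files.interleave(
    lambda path: tf.data.TextLineDataset(path).skip(1),  # skip each header
    cycle_length=4,
    num_parallel_calls=tf.data.AUTOTUNE,
).prefetch(tf.data.AUTOTUNE)

rows = [line.numpy().decode() for line in dataset]
print(sorted(rows))  # one row from each shard
```

With `num_parallel_calls=tf.data.AUTOTUNE`, the interleave reads from multiple shards at once, which is exactly what a single monolithic CSV file cannot offer.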
Author: LeetQuiz Editorial Team
You are training a machine learning model using TensorFlow, and you have a large dataset stored as a single 5-terabyte CSV file in Google Cloud Storage. During the profiling of the model's training time, you identify performance issues related to inefficiencies in the input data pipeline. To optimize the input pipeline performance and accelerate the training process, which action should you try first?
A
Preprocess the input CSV file into a TFRecord file.
B
Randomly select a 10 gigabyte subset of the data to train your model.
C
Split into multiple CSV files and use a parallel interleave transformation.
D
Set the reshuffle_each_iteration parameter to true in the tf.data.Dataset.shuffle method.