
You are training a machine learning model with TensorFlow, and your dataset is stored as a single 5-terabyte CSV file in Google Cloud Storage. While profiling the model's training run, you identify performance bottlenecks caused by inefficiencies in the input data pipeline. To optimize input pipeline performance and accelerate training, which action should you try first?
A
Preprocess the input CSV file into a TFRecord file.
B
Randomly select a 10 gigabyte subset of the data to train your model.
C
Split into multiple CSV files and use a parallel interleave transformation.
D
Set the reshuffle_each_iteration parameter to true in the tf.data.Dataset.shuffle method.
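To illustrate the parallel-interleave approach from option C, here is a minimal, runnable sketch using tf.data. The shard file names and contents are illustrative stand-ins; in the scenario the shards would live in Cloud Storage (a gs:// path) rather than a local temporary directory.

```python
import os
import tempfile

import tensorflow as tf

# Illustrative setup: write a few small CSV shards to a temp directory.
# In practice you would pre-split the 5 TB file into many shards in GCS.
tmp_dir = tempfile.mkdtemp()
for shard in range(4):
    with open(os.path.join(tmp_dir, f"part-{shard}.csv"), "w") as f:
        f.write("feature,label\n")  # header row
        for row in range(3):
            f.write(f"{shard}.{row},{shard}\n")

files = tf.data.Dataset.list_files(os.path.join(tmp_dir, "part-*.csv"))

# Read lines from several shards concurrently so file I/O overlaps with
# model computation; AUTOTUNE lets tf.data choose the parallelism level.
dataset = files.interleave(
    lambda path: tf.data.TextLineDataset(path).skip(1),  # skip header row
    cycle_length=4,
    num_parallel_calls=tf.data.AUTOTUNE,
).prefetch(tf.data.AUTOTUNE)

rows = [line.numpy().decode() for line in dataset]
```

With a single monolithic file, tf.data can only read sequentially from one source; splitting into shards lets interleave issue parallel reads, which is typically the first remedy for an I/O-bound input pipeline.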