
Answer-first summary for fast verification
Answer: Convert the images into TFRecords, store them in Cloud Storage, and then utilize the tf.data API to read the images for training.
Converting the images to TFRecords, storing them in Cloud Storage, and reading them with the tf.data API follows Google's best practices for datasets that do not fit in memory. For unstructured data such as images, Google recommends packing many items into large container files like TFRecord on Cloud Storage, because reading a few large files sequentially gives better read and write throughput than opening many small image files. tf.data then streams records from those files, so only the batches currently being processed need to be in memory. The other options do not address the memory constraint: prefetch (B) only overlaps preprocessing with model execution, and both Dataset.from_tensor_slices() (C) and Dataset.from_tensors() (D) require the images to be loaded into memory as tensors first. For more details, refer to Google's best practices documentation: [ML on GCP Best Practices](https://cloud.google.com/architecture/ml-on-gcp-best-practices#store-image-video-audio-and-unstructured-data-on-cloud-storage).
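The approach described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the feature keys `image` and `label`, the shard file naming, and the helper names `write_shards` and `build_dataset` are assumptions made for the example, the images are tiny synthetic PNGs, and a local directory stands in for a Cloud Storage bucket. In practice the same calls would be given `gs://` paths.

```python
import os
import tempfile

import tensorflow as tf

# Assumed schema for this sketch: one encoded image plus an integer label.
FEATURES = {
    "image": tf.io.FixedLenFeature([], tf.string),
    "label": tf.io.FixedLenFeature([], tf.int64),
}


def make_example(image_bytes, label):
    """Pack one encoded image and its label into a tf.train.Example."""
    feature = {
        "image": tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes])),
        "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))


def write_shards(out_dir, num_shards=2, per_shard=4):
    """Write synthetic 8x8 PNG images into several TFRecord shard files."""
    paths = []
    for shard in range(num_shards):
        # Hypothetical naming scheme; any consistent pattern works.
        name = f"images-{shard:05d}-of-{num_shards:05d}.tfrecord"
        path = os.path.join(out_dir, name)
        with tf.io.TFRecordWriter(path) as writer:
            for i in range(per_shard):
                pixels = tf.zeros([8, 8, 3], dtype=tf.uint8)
                image_bytes = tf.io.encode_png(pixels).numpy()
                example = make_example(image_bytes, label=i % 2)
                writer.write(example.SerializeToString())
        paths.append(path)
    return paths


def parse(record):
    """Decode one serialized Example into a float image and its label."""
    parsed = tf.io.parse_single_example(record, FEATURES)
    image = tf.io.decode_png(parsed["image"], channels=3)
    image = tf.image.convert_image_dtype(image, tf.float32)
    return image, parsed["label"]


def build_dataset(file_pattern, batch_size=4):
    """Stream records from all matching shards without loading them into memory."""
    files = tf.data.Dataset.list_files(file_pattern, shuffle=True)
    # Read several shard files in parallel and interleave their records.
    ds = files.interleave(
        lambda path: tf.data.TFRecordDataset(path),
        num_parallel_calls=tf.data.AUTOTUNE,
    )
    ds = ds.map(parse, num_parallel_calls=tf.data.AUTOTUNE)
    ds = ds.shuffle(64).batch(batch_size)
    # Prefetch overlaps input processing with the training step.
    return ds.prefetch(tf.data.AUTOTUNE)


if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    write_shards(out_dir)
    for images, labels in build_dataset(os.path.join(out_dir, "*.tfrecord")):
        print(images.shape, labels.shape)
```

The resulting dataset can be passed directly to `model.fit`. Note that prefetch from option B still appears here as the last step of the pipeline: it is a useful optimization on top of the TFRecord approach, not a replacement for it.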
Author: LeetQuiz Editorial Team
You are developing an input pipeline for a machine learning model that must process a large volume of images from various sources with low latency. Initial analysis shows that the input data does not fit in the memory of your current setup. Following Google's best practices for handling large datasets that do not fit in memory, which of the following approaches should you implement to optimize dataset creation and training? Choose the best option.
A
Convert the images into TFRecords, store them in Cloud Storage, and then utilize the tf.data API to read the images for training.
B
Apply a tf.data.Dataset.prefetch transformation to your dataset to overlap the preprocessing and model execution of a training step.
C
Transform the images into tf.Tensor objects, and then use Dataset.from_tensor_slices() to create a dataset from the tensors.
D
Transform the images into tf.Tensor objects, and then apply tf.data.Dataset.from_tensors() to create a dataset from a single tensor.