
Answer-first summary for fast verification
Answer: Transform the CSV files into shards of TFRecords and utilize Cloud Storage for data storage, leveraging TensorFlow's native support for TFRecords and Cloud Storage's scalability.
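Converting the CSV files into sharded TFRecords and storing them in Cloud Storage is the recommended strategy. TFRecord is TensorFlow's binary record format, so the input pipeline reads serialized records directly instead of parsing CSV text on every pass over the data. Splitting the data into many shards lets the pipeline read several files in parallel, and Cloud Storage serves those concurrent reads at high aggregate throughput, so training is less likely to stall on I/O. TensorFlow can read `gs://` paths natively, which keeps the solution cost-effective, scalable, and inside TensorFlow's ecosystem. By comparison, HDFS (option C) supports the same file format but requires provisioning and operating a cluster, while BigQuery and Cloud Bigtable (options B and D) are query and serving systems rather than bulk sequential stores, which generally makes them costlier for this access pattern.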
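The approach can be illustrated with a minimal sketch, assuming TensorFlow 2.4 or later. The column names `feature` and `label`, the shard naming scheme, and the helper names `csv_to_tfrecord_shards` and `make_dataset` are hypothetical and not part of the question; a real schema would differ.

```python
import csv
import os

import tensorflow as tf


def csv_to_tfrecord_shards(csv_path, out_dir, num_shards):
    """Write the rows of one CSV file round-robin into num_shards TFRecord files.

    Assumes (hypothetically) a header row with a float column "feature"
    and an integer column "label".
    """
    paths = [
        os.path.join(out_dir, f"data-{i:05d}-of-{num_shards:05d}.tfrecord")
        for i in range(num_shards)
    ]
    writers = [tf.io.TFRecordWriter(p) for p in paths]
    with open(csv_path, newline="") as f:
        for row_index, row in enumerate(csv.DictReader(f)):
            example = tf.train.Example(features=tf.train.Features(feature={
                "feature": tf.train.Feature(
                    float_list=tf.train.FloatList(value=[float(row["feature"])])),
                "label": tf.train.Feature(
                    int64_list=tf.train.Int64List(value=[int(row["label"])])),
            }))
            # Round-robin assignment keeps the shards roughly equal in size.
            writers[row_index % num_shards].write(example.SerializeToString())
    for writer in writers:
        writer.close()
    return paths


def make_dataset(file_pattern, batch_size):
    """Read sharded TFRecords in parallel and return batches of parsed features."""
    spec = {
        "feature": tf.io.FixedLenFeature([], tf.float32),
        "label": tf.io.FixedLenFeature([], tf.int64),
    }
    files = tf.data.Dataset.list_files(file_pattern, shuffle=True)
    # interleave opens several shards at once, so reads overlap instead of
    # proceeding one file at a time.
    records = files.interleave(
        tf.data.TFRecordDataset,
        num_parallel_calls=tf.data.AUTOTUNE,
    )
    batches = records.batch(batch_size)
    # Parsing a whole batch of serialized Examples at once is cheaper than
    # parsing them one by one.
    parsed = batches.map(
        lambda serialized: tf.io.parse_example(serialized, spec),
        num_parallel_calls=tf.data.AUTOTUNE,
    )
    return parsed.prefetch(tf.data.AUTOTUNE)
```

With the shards uploaded to Cloud Storage, the same reader would be pointed at a pattern such as `gs://<your-bucket>/data-*.tfrecord` (the bucket name is a placeholder). At the scale in the question, the conversion itself would normally run as a distributed job rather than a single loop like this one.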
Author: LeetQuiz Editorial Team
You are tasked with optimizing the input/output performance of a TensorFlow model that processes a structured dataset comprising 100 billion records distributed across multiple CSV files. The solution must consider cost-effectiveness, scalability, and compatibility with TensorFlow's ecosystem. Which of the following strategies would you recommend? (Choose one.)
A. Transform the CSV files into shards of TFRecords and utilize Cloud Storage for data storage, leveraging TensorFlow's native support for TFRecords and Cloud Storage's scalability.
B. Load the data into BigQuery and access the data directly from BigQuery, taking advantage of its serverless architecture and SQL interface for data manipulation.
C. Convert the CSV files into shards of TFRecords and store them in the Hadoop Distributed File System (HDFS), utilizing HDFS's distributed storage capabilities.
D. Import the data into Cloud Bigtable and read the data from Cloud Bigtable, benefiting from its low-latency and high-throughput capabilities for large-scale datasets.