Google Professional Machine Learning Engineer

You are tasked with optimizing the input/output performance of a TensorFlow model that processes a structured dataset comprising 100 billion records distributed across multiple CSV files. The solution must consider cost-effectiveness, scalability, and compatibility with TensorFlow's ecosystem. Which of the following strategies would you recommend? (Choose one.)

Transform the CSV files into shards of TFRecords and utilize Cloud Storage for data storage, leveraging TensorFlow's native support for TFRecords and Cloud Storage's scalability.

Load the data into BigQuery and access the data directly from BigQuery, taking advantage of its serverless architecture and SQL interface for data manipulation.

Convert the CSV files into shards of TFRecords and store them in the Hadoop Distributed File System (HDFS), utilizing HDFS's distributed storage capabilities.

Import the data into Cloud Bigtable and read the data from Cloud Bigtable, benefiting from its low-latency and high-throughput capabilities for large-scale datasets.

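For reference, the option that shards the CSV data into TFRecords on Cloud Storage can be sketched roughly as below. This is a minimal illustration, not a production pipeline: the column names `feature_a` and `label`, the bucket path, the shard count, and the helper names are all hypothetical, and a dataset of 100 billion records would in practice more likely be converted with a distributed job than with a single-process loop.

```python
import csv


def shard_path(prefix, index, num_shards):
    """Build a shard file name such as 'train-00003-of-00128.tfrecord'."""
    return f"{prefix}-{index:05d}-of-{num_shards:05d}.tfrecord"


def csv_to_tfrecord_shards(csv_path, prefix, num_shards):
    """Write the rows of one CSV file round-robin into TFRecord shards.

    Assumes hypothetical columns 'feature_a' (float) and 'label' (int).
    """
    import tensorflow as tf  # imported here so shard_path works without TF

    writers = [
        tf.io.TFRecordWriter(shard_path(prefix, i, num_shards))
        for i in range(num_shards)
    ]
    try:
        with open(csv_path, newline="") as f:
            for row_index, row in enumerate(csv.DictReader(f)):
                example = tf.train.Example(
                    features=tf.train.Features(
                        feature={
                            "feature_a": tf.train.Feature(
                                float_list=tf.train.FloatList(
                                    value=[float(row["feature_a"])]
                                )
                            ),
                            "label": tf.train.Feature(
                                int64_list=tf.train.Int64List(
                                    value=[int(row["label"])]
                                )
                            ),
                        }
                    )
                )
                # Spread records evenly across the shards.
                writers[row_index % num_shards].write(
                    example.SerializeToString()
                )
    finally:
        for writer in writers:
            writer.close()


def make_dataset(file_pattern, batch_size):
    """Read TFRecord shards in parallel with tf.data."""
    import tensorflow as tf

    feature_spec = {
        "feature_a": tf.io.FixedLenFeature([], tf.float32),
        "label": tf.io.FixedLenFeature([], tf.int64),
    }
    files = tf.data.Dataset.list_files(file_pattern, shuffle=True)
    # Interleave reads several shard files concurrently.
    records = files.interleave(
        tf.data.TFRecordDataset,
        cycle_length=4,
        num_parallel_calls=tf.data.AUTOTUNE,
    )
    parsed = records.map(
        lambda raw: tf.io.parse_single_example(raw, feature_spec),
        num_parallel_calls=tf.data.AUTOTUNE,
    )
    return parsed.batch(batch_size).prefetch(tf.data.AUTOTUNE)
```

A call such as `make_dataset("gs://my-bucket/train-*.tfrecord", batch_size=1024)` (the bucket name is a placeholder) would then stream the shards straight from Cloud Storage. Splitting the data into many shards is what lets `interleave` and `prefetch` overlap file reads with training.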