
Answer-first summary for fast verification
Answer: (D) Convert the CSV files into shards of TFRecords and store the data in Cloud Storage, benefiting from its high durability and integration with TensorFlow; and (E) implement a combination of TFRecords stored in Cloud Storage and BigQuery for complex queries on the dataset.
The most effective way to improve I/O performance during training is to convert the CSV files into shards of TFRecords and store them in Cloud Storage (Option D). TFRecord is a binary format optimized for TensorFlow: sharded files can be read in parallel by the tf.data input pipeline, serialization and parsing are efficient, and Cloud Storage scales with future data growth while integrating directly with TensorFlow on Google Cloud. Combining TFRecords in Cloud Storage with BigQuery (Option E) is also effective when the project requires complex analytical queries on the dataset, though it may introduce additional cost. The remaining options are weaker: Cloud Bigtable (Option A) is designed for low-latency NoSQL key-value workloads, not bulk sequential reads of structured training data; HDFS (Option B) can hold large datasets but requires cluster management and lacks Cloud Storage's managed integration with TensorFlow on Google Cloud; and reading training data directly from BigQuery (Option C) lacks the specialized optimization for TensorFlow input pipelines.
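As a minimal sketch of the recommended approach, the snippet below converts rows of structured data into sharded TFRecord files. The function name, the round-robin sharding scheme, and the two-field schema (`feature`, `label`) are illustrative assumptions, not part of the question; in practice the output directory would be a Cloud Storage path (`gs://...`) and the conversion would run in parallel (e.g. with Dataflow) for 100 billion records.

```python
import os
import tensorflow as tf

def csv_rows_to_tfrecord_shards(rows, out_dir, num_shards=4):
    """Write (feature, label) rows into `num_shards` TFRecord files.

    Illustrative sketch: rows are assigned round-robin so shards stay
    roughly equal in size, which lets tf.data read them in parallel.
    """
    writers = [
        tf.io.TFRecordWriter(
            os.path.join(out_dir, f"data-{i:05d}-of-{num_shards:05d}.tfrecord"))
        for i in range(num_shards)
    ]
    for i, (feature_val, label) in enumerate(rows):
        # Serialize each row as a tf.train.Example protocol buffer.
        example = tf.train.Example(features=tf.train.Features(feature={
            "feature": tf.train.Feature(
                float_list=tf.train.FloatList(value=[feature_val])),
            "label": tf.train.Feature(
                int64_list=tf.train.Int64List(value=[label])),
        }))
        writers[i % num_shards].write(example.SerializeToString())
    for w in writers:
        w.close()
```

For a real deployment, the shard count is typically chosen so each file lands in the ~100 MB–1 GB range, balancing parallel read throughput against per-file overhead.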
Author: LeetQuiz Editorial Team
You are working on a machine learning project where you need to train a TensorFlow model on a structured dataset containing 100 billion records, stored across multiple CSV files. The project is constrained by tight deadlines and a limited budget, requiring an efficient solution that minimizes costs while maximizing performance. Additionally, the solution must be scalable to accommodate future data growth. Given these constraints, which of the following approaches would BEST improve the input/output performance for training your TensorFlow model? Choose the two most effective options.
A
Load the data into Cloud Bigtable, and read the data from Bigtable, leveraging its high throughput and low latency for NoSQL operations.
B
Convert the CSV files into shards of TFRecords, and store the data in the Hadoop Distributed File System (HDFS), utilizing its distributed nature for large datasets.
C
Load the data into BigQuery, and read the data from BigQuery, taking advantage of its serverless architecture and SQL capabilities for structured data.
D
Convert the CSV files into shards of TFRecords, and store the data in Cloud Storage, benefiting from its high durability and integration with TensorFlow.
E
Implement a combination of both converting the CSV files into TFRecords stored in Cloud Storage and utilizing BigQuery for complex queries on the dataset.