
Answer-first summary for fast verification
Answer: Use TensorFlow I/O’s BigQuery Reader to directly read the data.
The correct answer is D. TensorFlow I/O’s BigQuery Reader is the most efficient and scalable option because it streams data directly from BigQuery into the TensorFlow input pipeline, with no intermediate export step. It leverages BigQuery’s parallel read streams, so ingestion throughput scales as the data volume grows. This avoids the extra data movement of exporting to CSV files (option B) or converting to TFRecords (option C), and it sidesteps the memory limits and single-machine bottleneck of loading hundreds of millions of records into a DataFrame (option A). Because reads are distributed across multiple streams, the reader keeps latency low and sustains high throughput during training.
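As a minimal sketch of what option D looks like in practice: the `tensorflow-io` package exposes a `BigQueryClient` whose read session yields a `tf.data.Dataset` backed by parallel BigQuery read streams. The project, dataset, table, and column names below are hypothetical placeholders, not from the question.

```python
def make_bigquery_dataset(project_id, dataset_id, table_id, batch_size=1024):
    """Build a tf.data.Dataset that streams rows directly from BigQuery.

    Sketch only: assumes the tensorflow-io package is installed and that
    the table has the (hypothetical) columns selected below.
    """
    # Imports are local so the function can be defined without GCP access.
    import tensorflow as tf
    from tensorflow_io.bigquery import BigQueryClient

    client = BigQueryClient()
    session = client.read_session(
        parent="projects/{}".format(project_id),
        project_id=project_id,
        dataset_id=dataset_id,
        table_id=table_id,
        selected_fields=["loan_amount", "credit_score", "defaulted"],
        output_types=[tf.float64, tf.int64, tf.int64],
        requested_streams=4,  # parallel read streams for throughput
    )
    # parallel_read_rows() interleaves the streams into one dataset,
    # so no intermediate CSV/TFRecord export is needed.
    return session.parallel_read_rows().batch(batch_size)
```

The resulting dataset plugs straight into `model.fit(...)` on Vertex AI; raising `requested_streams` increases read parallelism as the table grows.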
Author: LeetQuiz Editorial Team
You are part of a data science team at a bank tasked with creating a machine learning model to predict loan default risk. You have gathered and meticulously cleaned a vast amount of training data, consisting of hundreds of millions of records, which are now stored in a BigQuery table. Your next step is to develop and compare multiple models on this extensive dataset using TensorFlow and Vertex AI. Given the scale of the data, you want to ensure the data ingestion phase is efficient and scalable, minimizing any potential bottlenecks. What should you do to achieve this?
A
Use the BigQuery client library to load data into a dataframe, and use tf.data.Dataset.from_tensor_slices() to read it.
B
Export data to CSV files in Cloud Storage, and use tf.data.TextLineDataset() to read them.
C
Convert the data into TFRecords, and use tf.data.TFRecordDataset() to read them.
D
Use TensorFlow I/O’s BigQuery Reader to directly read the data.