
You are tasked with performing batch predictions on a BigQuery table containing 100 million records using a custom TensorFlow DNN regressor model. The predictions should be stored in another BigQuery table. Considering the need for efficiency, cost-effectiveness, and minimal setup complexity, which of the following approaches is most suitable?
A
Construct a Dataflow pipeline to transform BigQuery data into TFRecords, perform batch inference on Vertex AI Prediction, and save the results back to BigQuery. This approach requires setting up a Dataflow job and Vertex AI Prediction, which may introduce additional complexity and cost.
B
Utilize the TensorFlow BigQuery reader for data retrieval and the BigQuery API to store the prediction outcomes. This method involves custom coding for data retrieval and prediction storage, potentially increasing development time and maintenance overhead.
C
Deploy the TensorFlow SavedModel within a Dataflow pipeline. Use the BigQuery I/O connector alongside a custom function for inference within the pipeline, and output the results to BigQuery. This option combines Dataflow's scalability with custom inference logic, offering a balance between flexibility and complexity.
D
Leverage BigQuery ML to import the TensorFlow model and execute batch predictions using the ML.PREDICT function directly within BigQuery. This approach minimizes setup complexity and leverages BigQuery's built-in capabilities for efficient batch predictions without the need for external services or custom pipelines.
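
For reference, option D can be sketched in BigQuery SQL. The dataset, bucket, model, and table names below are placeholders; the model must be a TensorFlow SavedModel exported to Cloud Storage:

```sql
-- Import the TensorFlow SavedModel from Cloud Storage into BigQuery ML.
-- (my_dataset, my_bucket, and table names are illustrative placeholders.)
CREATE OR REPLACE MODEL `my_dataset.dnn_regressor`
  OPTIONS (MODEL_TYPE = 'TENSORFLOW',
           MODEL_PATH = 'gs://my_bucket/saved_model/*');

-- Run batch prediction over the 100M-row source table and
-- write the results directly to another BigQuery table.
CREATE OR REPLACE TABLE `my_dataset.predictions` AS
SELECT *
FROM ML.PREDICT(MODEL `my_dataset.dnn_regressor`,
                TABLE `my_dataset.source_table`);
```

Because both the data and the inference run inside BigQuery, no Dataflow job, Vertex AI endpoint, or custom pipeline code is required, which is why D best fits the stated constraints.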