
You are working on a project that requires executing batch predictions on a dataset of 100 million records stored in a BigQuery table using a custom TensorFlow DNN regressor model. The project has strict requirements for minimizing infrastructure complexity and operational overhead, while ensuring that the predicted results are efficiently stored back into a BigQuery table. Additionally, the solution must be cost-effective and scalable to accommodate potential increases in data volume. Considering these constraints, which approach should you take? (Choose one correct option)
A. Develop a Dataflow pipeline that converts the BigQuery data into TFRecords, uses Vertex AI Prediction for batch inference, and then writes the results back to BigQuery. This approach leverages Google Cloud's managed services but requires setting up and managing a Dataflow job.
B. Use the TensorFlow BigQuery reader to load data directly into TensorFlow and the BigQuery API to save predictions. This method avoids additional services but requires custom code for data loading and saving, increasing development effort.
C. Import the TensorFlow model into BigQuery ML and use the ML.PREDICT function to generate predictions directly within BigQuery. This solution integrates seamlessly with BigQuery, eliminating the need for external services or complex pipelines.
D. Deploy the TensorFlow SavedModel within a Dataflow pipeline, utilizing the BigQuery I/O connector and a custom function for inference. This approach allows for scalable processing within Dataflow but involves more setup and management overhead.
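The BigQuery ML approach in option C can be sketched with two statements: importing the SavedModel, then running batch prediction in SQL. The dataset, table, and Cloud Storage paths below are illustrative placeholders, not values from the question.

```sql
-- Register the exported TensorFlow SavedModel as a BigQuery ML model.
-- `mydataset.dnn_regressor` and the GCS path are assumed names.
CREATE OR REPLACE MODEL `mydataset.dnn_regressor`
  OPTIONS (MODEL_TYPE = 'TENSORFLOW',
           MODEL_PATH = 'gs://my_bucket/saved_model/*');

-- Run batch prediction over the input table and materialize the
-- results back into BigQuery, with no external services involved.
CREATE OR REPLACE TABLE `mydataset.predictions` AS
SELECT *
FROM ML.PREDICT(MODEL `mydataset.dnn_regressor`,
                (SELECT * FROM `mydataset.input_records`));
```

Because both data and model live inside BigQuery, the prediction job scales with BigQuery's execution engine and avoids managing a separate Dataflow or Vertex AI pipeline, which is what the question's low-overhead constraint points to.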