
Answer-first summary for fast verification
Answer: Deploy the model to a Vertex AI endpoint, and invoke this endpoint in the Dataflow job.
Deploying the model to a Vertex AI endpoint and invoking that endpoint from the Dataflow job is the best way to minimize serving latency for real-time prediction. Vertex AI endpoints provide a fully managed, autoscaling serving platform optimized for low-latency online inference. The Dataflow pipeline can call the endpoint for predictions without bundling the model as a worker dependency, which would bloat the workers and couple every model update to a pipeline redeployment. Vertex AI also offers built-in model monitoring, TensorBoard integration, and Model Registry governance, making it more efficient and scalable than self-managing TF Serving on GKE or a custom Cloud Run container.
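As a sketch of the recommended approach, the pipeline below reads log lines from Pub/Sub, calls a deployed Vertex AI endpoint through Beam's `RunInference` transform with the `VertexAIModelHandlerJSON` handler (available in Apache Beam 2.49+), and writes anomaly scores to BigQuery. All project, endpoint, topic, and table identifiers are placeholders, and the instance/result field names (`log_line`, `anomaly_score`) are assumptions about the model's request and response schema:

```python
# Hypothetical sketch of the Dataflow pipeline: Pub/Sub -> Vertex AI
# endpoint -> BigQuery. Identifiers are placeholders, not real resources.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.vertex_ai_inference import VertexAIModelHandlerJSON


def to_instance(message: bytes) -> dict:
    """Convert a raw Pub/Sub message into a Vertex AI prediction instance."""
    # Assumes the deployed model accepts a JSON instance with a "log_line" key.
    return {"log_line": message.decode("utf-8")}


def run():
    # The handler wraps calls to the deployed Vertex AI online endpoint;
    # endpoint_id, project, and location are placeholders.
    model_handler = VertexAIModelHandlerJSON(
        endpoint_id="ENDPOINT_ID",
        project="PROJECT_ID",
        location="us-central1",
    )

    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadLogs" >> beam.io.ReadFromPubSub(
                topic="projects/PROJECT_ID/topics/system-logs")
            | "ToInstance" >> beam.Map(to_instance)
            | "Predict" >> RunInference(model_handler)
            # RunInference yields PredictionResult(example, inference) pairs.
            | "ToRow" >> beam.Map(lambda result: {
                "log_line": result.example["log_line"],
                "anomaly_score": result.inference,
            })
            | "WriteResults" >> beam.io.WriteToBigQuery(
                "PROJECT_ID:security.log_anomalies",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )


if __name__ == "__main__":
    run()
```

Because the endpoint is managed separately, a new model version can be deployed (or traffic-split for canarying) on Vertex AI without touching the streaming pipeline, which is exactly the decoupling that option B (loading the model as a job dependency) gives up.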
Author: LeetQuiz Editorial Team
You are working as a machine learning engineer for a cybersecurity organization and your task is to develop a system log anomaly detection model. After developing the model using TensorFlow, the next step is to use it for real-time predictions. You plan to create a Dataflow pipeline to ingest data via Pub/Sub and write the results to BigQuery. Considering factors such as deployment, serving latency, and efficiency, what is the best approach to minimize the serving latency while deploying your TensorFlow model for real-time prediction?
A
Containerize the model prediction logic in Cloud Run, which is invoked by Dataflow.
B
Load the model directly into the Dataflow job as a dependency, and use it for prediction.
C
Deploy the model to a Vertex AI endpoint, and invoke this endpoint in the Dataflow job.
D
Deploy the model in a TFServing container on Google Kubernetes Engine, and invoke it in the Dataflow job.