
Answer-first summary for fast verification
Answer: Refactor the transformation code in the batch data pipeline so that it can be used outside of the pipeline. Use the same code in the endpoint.
The correct answer is B. Training-serving skew arises when the preprocessing applied at serving time differs from what was applied to the training data. Refactoring the transformation code out of the batch Dataflow pipeline so that it can run on its own, and then calling that same code in the endpoint, ensures that every real-time request is preprocessed exactly as the training data was. The other options fall short. A only validates the input format and does not apply the transformations themselves. C relies on end users to run the preprocessing correctly, which is error-prone and outside your control. D adds latency by waiting for a time window and running a batch pipeline, which works against real-time inference.
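As a rough illustration of option B, the sketch below keeps the transformation in one plain function that both the batch path and the serving path call. Everything in it is hypothetical: the feature names (`amount`, `age`), the transformations, and the helpers `run_batch` and `handle_request` are assumptions for illustration, not details from the question.

```python
import math
from typing import Callable, Dict, List

# Hypothetical shared module (for example, preprocessing.py) that is imported
# by both the batch pipeline and the serving endpoint.


def preprocess(record: Dict[str, float]) -> List[float]:
    """Turn one raw record into model features.

    The transformations here are made up for illustration only.
    """
    amount = math.log1p(max(record["amount"], 0.0))  # log-scale, clamp negatives
    age_scaled = record["age"] / 100.0               # simple rescaling
    return [amount, age_scaled]


def run_batch(records: List[Dict[str, float]]) -> List[List[float]]:
    """Batch path: apply the shared function to every training record.

    In a Beam/Dataflow pipeline this step could be expressed as
    beam.Map(preprocess); a list comprehension stands in for it here.
    """
    return [preprocess(r) for r in records]


def handle_request(payload: Dict[str, float],
                   predict: Callable[[List[float]], float]) -> float:
    """Serving path: apply the same function to a single request, then predict."""
    features = preprocess(payload)
    return predict(features)


if __name__ == "__main__":
    raw = {"amount": 120.0, "age": 42.0}
    # Both paths produce identical features because they share one function.
    print(run_batch([raw])[0] == preprocess(raw))
    print(handle_request(raw, predict=lambda feats: sum(feats)))
```

Because the logic lives in a single function with no dependency on the pipeline runner, a change to the preprocessing is picked up by training and serving together rather than having to be mirrored by hand.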
Author: LeetQuiz Editorial Team
You have trained a machine learning model using data that was preprocessed in a batch Dataflow pipeline. Now, you need the model to provide real-time inference as part of a production system. To ensure accuracy and consistency in predictions, it's crucial that the data preprocessing logic used during training is applied consistently during serving as well. How can you best achieve this?
A
Perform data validation to ensure that the input data to the pipeline is the same format as the input data to the endpoint.
B
Refactor the transformation code in the batch data pipeline so that it can be used outside of the pipeline. Use the same code in the endpoint.
C
Refactor the transformation code in the batch data pipeline so that it can be used outside of the pipeline. Share this code with the end users of the endpoint.
D
Batch the real-time requests by using a time window and then use the Dataflow pipeline to preprocess the batched requests. Send the preprocessed requests to the endpoint.