
You have deployed a RAG model for document retrieval and response generation in a customer service application. Over time, you want to monitor whether the model's performance degrades, particularly its ability to generate useful and accurate responses. Which of the following approaches would be most appropriate for using MLflow to monitor model drift over time?
A. Monitor the accuracy of the retrieval step over time
B. Track the number of queries processed by the model daily
C. Monitor the change in the learning rate and number of training epochs used in fine-tuning the model
D. Regularly log BLEU and ROUGE scores on a fixed set of evaluation queries and compare them over time
Explanation:
D. Regularly log BLEU and ROUGE scores on a fixed set of evaluation queries and compare them over time
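BLEU (Bilingual Evaluation Understudy) and ROUGE (Recall-Oriented Understudy for Gisting Evaluation) are standard metrics for evaluating generated text in natural language processing tasks. Both score a generated response by its n-gram overlap with a reference text: BLEU is precision-oriented and ROUGE is recall-oriented.
Because the evaluation queries are fixed, a change in the scores between runs reflects a change in the system rather than in the inputs. Logging the scores to MLflow at regular intervals therefore gives a quantitative, reproducible way to detect drift in response generation quality over time. The other options fall short: retrieval accuracy (A) does not measure the generated responses, query volume (B) measures load rather than quality, and the learning rate and epoch count (C) are training settings that do not change after deployment.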
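The approach in option D can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the two scoring functions are simplified unigram stand-ins for full BLEU and ROUGE (a real setup would more likely use an established metrics library), and the function names, the run name, and the 0.05 tolerance are all hypothetical choices made for the example. Only `mlflow.start_run` and `mlflow.log_metrics` are actual MLflow calls.

```python
from collections import Counter


def _overlap(candidate, reference):
    """Return (clipped unigram matches, candidate length, reference length)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    matches = sum((Counter(cand) & Counter(ref)).values())
    return matches, len(cand), len(ref)


def unigram_precision(candidate, reference):
    """BLEU-1-style clipped precision (simplified: no brevity penalty)."""
    matches, n_cand, _ = _overlap(candidate, reference)
    return matches / n_cand if n_cand else 0.0


def rouge1_recall(candidate, reference):
    """ROUGE-1-style recall (simplified: unigrams only)."""
    matches, _, n_ref = _overlap(candidate, reference)
    return matches / n_ref if n_ref else 0.0


def evaluate_fixed_set(responses, references):
    """Average both scores over the fixed evaluation set."""
    pairs = list(zip(responses, references))
    if not pairs:
        raise ValueError("evaluation set is empty")
    return {
        "bleu1_precision": sum(unigram_precision(c, r) for c, r in pairs) / len(pairs),
        "rouge1_recall": sum(rouge1_recall(c, r) for c, r in pairs) / len(pairs),
    }


def has_drifted(baseline, current, tolerance=0.05):
    """Flag drift when any metric falls more than `tolerance` below baseline.

    The 0.05 default is an arbitrary value chosen for illustration.
    """
    return any(baseline[name] - current[name] > tolerance for name in baseline)


def log_evaluation(scores, step):
    """Log one evaluation round to MLflow (requires the mlflow package)."""
    import mlflow  # imported here so the scoring code runs without MLflow installed

    with mlflow.start_run(run_name=f"rag-eval-{step}"):  # hypothetical run name
        mlflow.log_metrics(scores, step=step)
```

In use, each scheduled evaluation would generate responses for the same fixed queries, pass them to `evaluate_fixed_set` together with the reference answers, call `log_evaluation` with an increasing `step`, and compare the result against a stored baseline with `has_drifted`. Keeping the queries and references unchanged between runs is what makes the logged scores comparable over time.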