You deployed an ML model into production a year ago to serve predictions for a business-critical application. To ensure the model remains accurate, you have been collecting all raw requests sent to your model prediction service each month and sending a subset of these requests to a human labeling service to evaluate your model's performance. Over the past year, you have observed varying patterns of model degradation: sometimes performance drops significantly within a month, while at other times it takes several months before a noticeable decrease appears. Given that the human labeling service is expensive, you need a strategy that balances retraining frequency against cost, keeping performance high without retraining unnecessarily. What should you do?
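The scenario above points toward performance-triggered retraining rather than a fixed schedule: evaluate the model on the human-labeled sample each month and retrain only when measured accuracy falls below an acceptable threshold. A minimal sketch of this idea follows; the threshold value and helper names (`evaluate_on_labeled_sample`, `should_retrain`) are illustrative assumptions, not part of the original question.

```python
ACCURACY_THRESHOLD = 0.90  # hypothetical minimum acceptable accuracy


def evaluate_on_labeled_sample(predictions, labels):
    """Compute accuracy on the human-labeled subset collected this month."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def should_retrain(monthly_accuracy, threshold=ACCURACY_THRESHOLD):
    """Trigger retraining only when measured performance drops below the threshold."""
    return monthly_accuracy < threshold


# Example: three months of simulated evaluation results
history = [0.95, 0.93, 0.88]
decisions = [should_retrain(acc) for acc in history]
# Only the third month (accuracy 0.88 < 0.90) triggers a retrain.
```

Because labeling is the expensive step, a threshold trigger like this keeps labeling costs bounded (one small sample per month) while ensuring retraining happens exactly when degradation is actually observed, regardless of whether it takes one month or several.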