© 2025 LeetQuiz All rights reserved.
Google Professional Machine Learning Engineer



You are working on a classification problem involving time series data and have quickly achieved an AUC ROC value of 99% on your training data with minimal experimentation, without employing sophisticated algorithms or hyperparameter tuning. Given the high performance with little effort, you suspect there might be an underlying issue. The dataset is large, and computational resources are not a constraint. However, the project has strict compliance requirements, and the solution must be scalable for future data. Which of the following steps is the MOST appropriate to identify and resolve the underlying issue? Choose the best option.




Explanation:

Reaching an AUC ROC of 99% with almost no experimentation strongly suggests data leakage: information that would not be available at prediction time, such as values from the future, is reaching the model during training or evaluation. Nested cross-validation is the most direct way to expose and correct this. An outer loop splits the data into folds and holds one out for evaluation, while an inner loop runs only on the remaining data to select the model and tune hyperparameters, so the held-out fold never influences any training decision. For time series data, the folds must also respect temporal order, with each training window preceding its evaluation window, so that performance is assessed without the influence of future data. If the score drops sharply under this scheme, the original 99% was likely inflated by leakage. Simplifying the model or adjusting hyperparameters can be beneficial in other contexts, but neither targets leakage directly. Because the dataset is large and compute is not a constraint, the extra cost of nested cross-validation is acceptable, and a leakage-resistant evaluation procedure also supports the project's compliance and scalability requirements.
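The nested scheme described in the explanation can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the exam's reference solution: the synthetic data, the logistic regression model, the `C` grid, and the helper name `nested_time_series_auc` are all hypothetical choices made for the example. `TimeSeriesSplit` is used for both loops so that every training window precedes its evaluation window.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit


def nested_time_series_auc(X, y, n_outer=4, n_inner=3):
    """Return one out-of-sample AUC per outer fold (hypothetical helper)."""
    outer = TimeSeriesSplit(n_splits=n_outer)
    scores = []
    for train_idx, test_idx in outer.split(X):
        # Inner loop: hyperparameters are tuned on the training window only,
        # again with time-ordered splits.
        search = GridSearchCV(
            LogisticRegression(max_iter=1000),
            param_grid={"C": [0.1, 1.0, 10.0]},  # assumed grid, for illustration
            scoring="roc_auc",
            cv=TimeSeriesSplit(n_splits=n_inner),
        )
        search.fit(X[train_idx], y[train_idx])
        # Outer loop: score the refit best model on the later, unseen window.
        proba = search.predict_proba(X[test_idx])[:, 1]
        scores.append(roc_auc_score(y[test_idx], proba))
    return scores


# Synthetic, time-ordered data (an assumption): rows are already sorted by time,
# and only the first feature carries signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))
y = (X[:, 0] + rng.normal(scale=1.0, size=600) > 0).astype(int)

scores = nested_time_series_auc(X, y)
print("outer-fold AUCs:", np.round(scores, 3))
```

Comparing these outer-fold scores with the score obtained on the training data is the useful check: a large gap is a signal to audit features and preprocessing for information from the future. Note that a feature computed directly from future labels would still score well here, so the split scheme complements a feature audit rather than replacing it.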
