Google Professional Machine Learning Engineer

Ultimate access to all questions.

In the process of developing a machine learning model using regression techniques for a project aimed at predicting housing prices, you've encountered a scenario where the model performs exceptionally well during training but fails to generalize during testing, leading to unsatisfactory results. A preliminary investigation reveals issues related to data quality, including wrong and missing data. Your team decides to implement a data validation tool to address these issues. Considering the constraints of budget, time, and the need for scalability, which of the following issues is NOT directly related to Data Validation? Choose the best option.

Real Exam

Duplicate examples in the dataset, which could skew the model's understanding of feature importance.

15.5%

Loading comments...