You are working on a machine learning project using Vertex AI Workbench and experimenting with a distributed XGBoost model. To split your dataset into training and validation sets, you use BigQuery and run the following SQL queries:

CREATE OR REPLACE TABLE `myproject.mydataset.training` AS (SELECT * FROM `myproject.mydataset.mytable` WHERE RAND() <= 0.8);
CREATE OR REPLACE TABLE `myproject.mydataset.validation` AS (SELECT * FROM `myproject.mydataset.mytable` WHERE RAND() <= 0.2);

After training the model with these sets, you achieve an area under the receiver operating characteristic curve (AUC ROC) value of 0.8. However, upon deploying the model to production, the AUC ROC value drops significantly to 0.65. What is the most likely cause of this problem?
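One way to reason about the two queries is to simulate them. The sketch below is a plain Python stand-in, not BigQuery code: it assumes that each query draws a fresh, independent random number per row, which is how RAND() behaves. The function names, the row count, and the seed are hypothetical and chosen only for illustration.

```python
import random


def independent_split(n_rows, seed=0):
    """Mimic the two queries: each one draws its own random number per row."""
    rng = random.Random(seed)
    # First query: keep a row when its draw is <= 0.8.
    training = {i for i in range(n_rows) if rng.random() <= 0.8}
    # Second query: a new, unrelated draw per row, kept when <= 0.2.
    validation = {i for i in range(n_rows) if rng.random() <= 0.2}
    return training, validation


def single_draw_split(n_rows, seed=0):
    """Contrast case: one draw per row decides which set the row belongs to."""
    rng = random.Random(seed)
    training, validation = set(), set()
    for i in range(n_rows):
        (training if rng.random() <= 0.8 else validation).add(i)
    return training, validation


n_rows = 100_000
training, validation = independent_split(n_rows)

# Share of validation rows that also appear in the training set.
overlap = len(training & validation) / len(validation)
# Share of source rows that end up in neither table.
unused = 1 - len(training | validation) / n_rows

print(f"validation rows also in training: {overlap:.1%}")
print(f"source rows in neither table:     {unused:.1%}")
```

Under this assumption, roughly 80% of the validation rows are expected to appear in the training table as well, and roughly 16% of the source rows (0.2 × 0.8) land in neither table. An overlap of that size would inflate the validation AUC ROC relative to production. Deciding membership from a single value per row, as in `single_draw_split`, keeps the two sets disjoint and uses every row.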