
You are building a machine learning model to predict daily temperatures based on hourly temperature data that is continuously uploaded. Initially, you randomly split the dataset into training and test sets and applied transformations to these datasets separately. During testing, your model achieved an accuracy of 97%. However, after deploying the model to a production environment, its accuracy dropped to 66%. What steps can you take to improve your model's accuracy in production?
A
Normalize the data for the training and test datasets as two separate steps.
B
Split the training and test data based on time rather than a random split to avoid leakage.
C
Add more data to your test set to ensure that you have a fair distribution and sample for testing.
D
Apply data transformations before splitting, and cross-validate to make sure that the transformations are applied to both the training and test sets.
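
For reference, the time-based split named in option B can be sketched as below. This is a minimal illustration only, not a prescribed solution: the function names (`time_based_split`, `fit_standardizer`, `standardize`), the 80/20 cut, and the synthetic hourly readings are all assumptions made for the example. It orders records chronologically, holds out the most recent rows for testing, and computes the scaling statistics from the training rows alone.

```python
from datetime import datetime, timedelta


def time_based_split(records, train_fraction=0.8):
    """Split (timestamp, value) records chronologically.

    The oldest rows go to training and the newest rows to testing,
    so no test row precedes any training row.
    """
    ordered = sorted(records, key=lambda r: r[0])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def fit_standardizer(values):
    """Compute mean and standard deviation from the given values only."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = variance ** 0.5 or 1.0  # fall back to 1.0 if all values are equal
    return mean, std


def standardize(values, mean, std):
    """Apply previously fitted statistics to any list of values."""
    return [(v - mean) / std for v in values]


# Hypothetical data: 240 synthetic hourly temperature readings.
start = datetime(2024, 1, 1)
records = [(start + timedelta(hours=h), 10.0 + 0.01 * h) for h in range(240)]

train, test = time_based_split(records)

# Fit the transformation on the training period only, then reuse it on test.
mean, std = fit_standardizer([v for _, v in train])
train_scaled = standardize([v for _, v in train], mean, std)
test_scaled = standardize([v for _, v in test], mean, std)
```

Because the held-out rows are strictly later than the training rows, the evaluation resembles the production setting, where the model only ever sees data newer than anything it was trained on.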