
Ultimate access to all questions.
You are tasked with developing a machine learning model to forecast daily temperatures for a weather forecasting application. The dataset includes hourly temperature readings over several years. Initially, you randomly split the data into training and test sets, applied necessary transformations, and achieved a testing accuracy of 97%. However, upon deployment, the model's accuracy significantly dropped to 66%. Considering the time-sensitive nature of the data and the need for high accuracy in production, which strategies would you implement to improve the model's production accuracy? (Choose two options)
A
Normalize the training and test datasets separately to ensure each set's distribution is independently scaled.
B
Implement a time-based split for training and test data to prevent data leakage and ensure the model is not trained on future data.
C
Increase the size of the test set by adding more recent data to ensure it is representative of current weather patterns.
D
Apply all data transformations before splitting the dataset and use cross-validation to uniformly assess model performance across different subsets of the data.
E
Combine options B and D to both prevent data leakage and ensure uniform application of transformations across training and test sets.