You are tasked with developing a machine learning model to forecast daily temperatures for a weather forecasting application. The dataset includes hourly temperature readings over several years. Initially, you randomly split the data into training and test sets, applied necessary transformations, and achieved a testing accuracy of 97%. However, upon deployment, the model's accuracy significantly dropped to 66%. Considering the time-sensitive nature of the data and the need for high accuracy in production, which strategies would you implement to improve the model's production accuracy? (Choose two options)

Real Exam

Normalize the training and test datasets separately to ensure each set's distribution is independently scaled.

11.1%

Implement a time-based split for training and test data to prevent data leakage and ensure the model is not trained on future data.

50.0%

Increase the size of the test set by adding more recent data to ensure it is representative of current weather patterns.

8.3%

Apply all data transformations before splitting the dataset and use cross-validation to uniformly assess model performance across different subsets of the data.

25.0%

Combine options B and D to both prevent data leakage and ensure uniform application of transformations across training and test sets.

5.6%

Google Professional Machine Learning Engineer

Get started today

Comments