Ultimate access to all questions.
As a Professional Machine Learning Engineer for a large hotel chain, you're tasked with predicting user lifetime value (LTV) to support the marketing team's strategy for the next 20 days. The dataset, stored in BigQuery, includes a time signal across various columns. The marketing team emphasizes the importance of avoiding data leakage and ensuring the model's predictions are based on chronological data to reflect real-world scenarios accurately. Given these constraints, how should you prepare the data for AutoML Tables to fit the optimal model? Choose the best option.
Explanation:
The correct approach is to manually split the data based on the time signal, ensuring the validation set follows the training set chronologically, and the testing set follows the validation set. This prevents data leakage and respects the temporal sequence, crucial for time series data. AutoML Tables can manage transformations, but the chronological integrity of data splits must be manually maintained.