
As a Professional Machine Learning Engineer for a large hotel chain, you are tasked with predicting user lifetime value (LTV) over the next 20 days to support the marketing team's strategy. The dataset is stored in BigQuery, and its time signal is spread across multiple columns. The marketing team stresses the importance of avoiding data leakage and of training and evaluating the model in chronological order, so that results accurately reflect real-world use. Given these constraints, how should you prepare the data for AutoML Tables to fit the optimal model? Choose the best option.
A
Combine all time-related columns into an array and let AutoML interpret it. Split the data automatically into training, validation, and testing sets.
B
Submit the data for training without manual transformations and let AutoML handle the appropriate transformations. Split the data automatically into training, validation, and testing sets.
C
Submit the data for training without manual transformations and indicate an appropriate column as the Time column. Let AutoML split the data based on the time signal. Reserve the most recent data for validation and testing sets.
D
Submit the data for training without manual transformations. Manually split the data based on the columns with a time signal. Ensure that the data in the validation set is from 30 days after the data in the training set and that the data in the testing set is from 30 days after the validation set.
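As background for the options above, the following is a minimal sketch of what a chronological split does: rows are ordered by a time column and the most recent rows are held out for validation and testing, so no evaluation row predates a training row. It is plain Python rather than AutoML Tables code, and the `chronological_split` helper, the `booking_date` column, the sample rows, and the 80/10/10 fractions are all assumptions made for illustration.

```python
from datetime import date, timedelta


def chronological_split(rows, time_key, train_frac=0.8, valid_frac=0.1):
    """Sort rows by time; keep the oldest for training, the newest for testing."""
    ordered = sorted(rows, key=lambda r: r[time_key])
    n = len(ordered)
    train_end = int(n * train_frac)
    valid_end = train_end + int(n * valid_frac)
    # Slices are contiguous in time: train, then validation, then test.
    return ordered[:train_end], ordered[train_end:valid_end], ordered[valid_end:]


# Hypothetical data: one row per day for 100 days.
start = date(2023, 1, 1)
rows = [
    {"booking_date": start + timedelta(days=i), "ltv": 100.0 + i}
    for i in range(100)
]

train, valid, test = chronological_split(rows, "booking_date")

# Leakage check: every training row is older than every validation row,
# and every validation row is older than every test row.
assert max(r["booking_date"] for r in train) < min(r["booking_date"] for r in valid)
assert max(r["booking_date"] for r in valid) < min(r["booking_date"] for r in test)
```

A random split, by contrast, can place rows from the same period in both the training and evaluation sets, which is the kind of leakage the question warns about. Note that this sketch splits by row count, so rows sharing an identical timestamp could land on both sides of a boundary.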