
Answer-first summary for fast verification
Answer: Use Vertex AI chronological split, and specify the sales timestamp feature as the time variable
The correct answer is C: Use Vertex AI chronological split, and specify the sales timestamp feature as the time variable. Sales data is ordered in time, so the split should respect that order. With a chronological split, Vertex AI sorts the rows by the column you designate as the time variable and assigns the earliest rows to the training set, the next rows to the validation set, and the most recent rows to the test set. The model is therefore always evaluated on sales that occurred after the sales it was trained on, which mirrors how it will be used once the new store opens. This prevents future information from leaking into training and lets the model learn time-dependent patterns such as trend and seasonality. A random split, by contrast, would mix later sales into the training set and make the evaluation metrics look better than they are.
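To make the splitting rule concrete, here is a minimal sketch of a chronological split in plain Python. It is an illustration only, not Vertex AI's implementation: the helper `chronological_split`, the column names `store_name`, `sale_timestamp`, and `sales`, and the sample rows are all hypothetical, and the 80/10/10 fractions are assumed to match the Vertex AI defaults.

```python
from datetime import datetime, timedelta


def chronological_split(rows, time_key, train_frac=0.8, val_frac=0.1):
    """Sort rows by time, then cut them into train / validation / test.

    The earliest rows go to training, the next ones to validation,
    and whatever remains (the most recent rows) goes to test.
    """
    ordered = sorted(rows, key=lambda r: r[time_key])
    n = len(ordered)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    return ordered[:train_end], ordered[train_end:val_end], ordered[val_end:]


# Hypothetical data: 20 daily sales rows spread over three stores,
# deliberately stored newest-first so the sort step matters.
start = datetime(2024, 1, 1)
rows = [
    {
        "store_name": f"store_{i % 3}",
        "sale_timestamp": start + timedelta(days=i),
        "sales": 100 + i,
    }
    for i in reversed(range(20))
]

train, val, test = chronological_split(rows, "sale_timestamp")

# Every training row precedes every validation row, which in turn
# precedes every test row, so no future sales leak into training.
latest_train = max(r["sale_timestamp"] for r in train)
earliest_val = min(r["sale_timestamp"] for r in val)
latest_val = max(r["sale_timestamp"] for r in val)
earliest_test = min(r["sale_timestamp"] for r in test)
print(len(train), len(val), len(test))
```

In Vertex AI itself you do not write this logic; you only name the time column. In the Vertex AI Python SDK this is, as far as we know, the `timestamp_split_column_name` argument of `AutoMLTabularTrainingJob.run`, but check the current SDK reference before relying on that name.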
Author: LeetQuiz Editorial Team
As a data scientist at a retail company, you are tasked with training a sales prediction model using a managed tabular dataset in Vertex AI. The dataset contains sales data from three different stores, including features such as store name and sale timestamp. The goal is to make accurate sales predictions for a new store that will open soon. To achieve this, you need to split the data between training, validation, and test sets. Which approach should you take to split the dataset effectively?
A. Use Vertex AI manual split, using the store name feature to assign one store for each set
B. Use Vertex AI default data split
C. Use Vertex AI chronological split, and specify the sales timestamp feature as the time variable
D. Use Vertex AI random split, assigning 70% of the rows to the training set, 10% to the validation set, and 20% to the test set