
Answer-first summary for fast verification
Answer: Use a boosted tree model. Perform label encoding on categorical features, and transform the date column into numeric values.
The question asks for the most effective combination to maximize prediction accuracy for product sales using BigQuery ML. Option B is optimal because: (1) Boosted tree models (like XGBoost in BigQuery ML) generally outperform linear regression for tabular data with mixed feature types, as they handle non-linear relationships and interactions better; (2) Label encoding is suitable for tree-based models, which can efficiently split on encoded categorical values without the dimensionality explosion of one-hot encoding; (3) Transforming the date column into numeric values (e.g., Unix timestamp) allows the model to capture temporal trends effectively. In contrast, option A uses linear regression, which is less flexible for complex patterns, and one-hot encoding may increase sparsity unnecessarily. Options C and D are unsuitable: autoencoders are for unsupervised learning, and matrix factorization is for recommendation systems, not sales prediction. The community discussion shows mixed votes (A: 56%, B: 44%), but B's upvoted comments highlight boosted trees' effectiveness, and industry best practices favor tree-based models for such regression tasks.
Author: LeetQuiz Editorial Team
Ultimate access to all questions.
No comments yet.
You are developing a machine learning model to predict product sales using historical data stored in BigQuery. The data includes features like date, store location, product category, and promotion details. To maximize prediction accuracy, what is the most effective combination of a BigQuery ML model type and feature engineering strategy?
A
Use a linear regression model. Perform one-hot encoding on categorical features, and create additional features based on the date, such as day of the week or month.
B
Use a boosted tree model. Perform label encoding on categorical features, and transform the date column into numeric values.
C
Use an autoencoder model. Perform label encoding on categorical features, and normalize the date column.
D
Use a matrix factorization model. Perform one-hot encoding on categorical features, and create interaction features between the store location and product category variables.