
Answer-first summary for fast verification
Answer:
- D: Data preparation: Rolling average feature engineering; Model training: Logistic regression with BQML and AUTO_CLASS_WEIGHTS set to True
- E: Data preparation: Rolling average and standard deviation feature engineering; Model training: XGBoost with BQML and AUTO_CLASS_WEIGHTS set to True
**Correct Answers: D and E**

Here’s why:

- **Rolling average**: Smooths noise in the hourly sensor readings and captures trends over time, which matter for predicting machine failures. Adding a rolling standard deviation (as in option E) also captures variability in the readings, which can further improve model performance.
- **Logistic regression and XGBoost**: Both suit binary classification tasks such as predicting machine failure. Logistic regression is simpler and cheaper to train, while XGBoost can capture more complex patterns but may require more computational resources.
- **BQML**: Trains and serves models inside BigQuery, so it scales with the warehouse and the sensor data does not have to be exported, which helps with data privacy compliance.
- **AUTO_CLASS_WEIGHTS**: Enabling this option addresses class imbalance, so the model does not overlook rare failure events.

Other options are less optimal:

- **Daily max/min values**: While useful, they capture temporal dynamics less effectively than rolling averages.
- **AUTO_CLASS_WEIGHTS set to False**: Risks biasing the model toward the majority class in imbalanced datasets.
- **AutoML classification**: While powerful, it may not be as cost-effective as logistic regression or XGBoost for this specific use case.
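The two ideas the correct options rely on, rolling-window features and class weighting, can be illustrated with a small standard-library sketch. This is not how BQML computes anything internally: the readings, the labels, the window size, and the helper names `rolling_features` and `balanced_class_weights` are all hypothetical, and the weight normalization shown is one common "inverse frequency" convention assumed here for illustration.

```python
from collections import Counter
from statistics import mean, pstdev


def rolling_features(readings, window):
    """Return (rolling_mean, rolling_std) per reading over a trailing window."""
    features = []
    for i in range(len(readings)):
        # Trailing window: up to `window` readings ending at index i.
        chunk = readings[max(0, i - window + 1): i + 1]
        # Population std is used so a one-element window yields 0.0
        # instead of raising (an assumption made for this sketch).
        features.append((mean(chunk), pstdev(chunk)))
    return features


def balanced_class_weights(labels):
    """Weight each class in inverse proportion to its frequency."""
    counts = Counter(labels)
    total = len(labels)
    # Assumed normalization: total / (num_classes * class_count).
    return {cls: total / (len(counts) * n) for cls, n in counts.items()}


# Hypothetical hourly temperature readings for one machine.
temps = [70.0, 71.0, 69.0, 90.0, 95.0]
feats = rolling_features(temps, window=3)

# Hypothetical labels: 1 = failure within the next 3 days (rare), 0 = no failure.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
weights = balanced_class_weights(labels)

for value, (avg, std) in zip(temps, feats):
    print(f"reading={value:5.1f}  rolling_mean={avg:6.2f}  rolling_std={std:5.2f}")
print("class weights:", weights)
```

The jump in the last two readings shows up in both the rolling mean and the rolling standard deviation, and the rare failure class receives a much larger weight than the majority class, which is the effect that enabling class weighting is meant to have.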
Author: LeetQuiz Editorial Team
In your role at a manufacturing company, you're tasked with predicting failures of a high-value machine equipped with multiple sensors. Historical hourly sensor readings and failure events are stored in BigQuery. Your goal is to predict if the machine will fail within the next 3 days to schedule timely maintenance. The solution must consider cost-effectiveness, scalability for thousands of machines, and compliance with data privacy regulations. What are the optimal data preparation and model training steps? Choose the best two options.
A. Data preparation: Daily min value feature engineering; Model training: Logistic regression with BQML and AUTO_CLASS_WEIGHTS set to True
B. Data preparation: Rolling average feature engineering; Model training: Logistic regression with BQML and AUTO_CLASS_WEIGHTS set to False
C. Data preparation: Daily max value feature engineering; Model training: AutoML classification with BQML
D. Data preparation: Rolling average feature engineering; Model training: Logistic regression with BQML and AUTO_CLASS_WEIGHTS set to True
E. Data preparation: Rolling average and standard deviation feature engineering; Model training: XGBoost with BQML and AUTO_CLASS_WEIGHTS set to True
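
For reference, options D and E might translate into BQML `CREATE MODEL` statements roughly like the ones built below. This is a sketch under stated assumptions: the dataset, model, table, and label column names (`maintenance.*`, `will_fail_3d`) and the helper `create_model_sql` are hypothetical, and the feature table is assumed to already hold only the engineered rolling features plus the label. The model types `LOGISTIC_REG` and `BOOSTED_TREE_CLASSIFIER` (BQML's XGBoost-based classifier) and the `auto_class_weights` option are the parts taken from BQML itself.

```python
def create_model_sql(model_name, model_type, feature_table, label_col="will_fail_3d"):
    """Build a BQML CREATE MODEL statement with class weighting enabled.

    All names passed in are hypothetical placeholders for this sketch.
    """
    return (
        f"CREATE OR REPLACE MODEL `{model_name}`\n"
        f"OPTIONS (\n"
        f"  model_type = '{model_type}',\n"
        f"  input_label_cols = ['{label_col}'],\n"
        f"  auto_class_weights = TRUE\n"
        f") AS\n"
        # Assumes the table contains only feature columns and the label.
        f"SELECT * FROM `{feature_table}`"
    )


# Option D: logistic regression on rolling-average features.
sql_d = create_model_sql(
    "maintenance.failure_logreg", "LOGISTIC_REG", "maintenance.features"
)

# Option E: boosted trees on rolling-average and standard-deviation features.
sql_e = create_model_sql(
    "maintenance.failure_xgb", "BOOSTED_TREE_CLASSIFIER", "maintenance.features"
)

print(sql_d)
print()
print(sql_e)
```

The two statements differ only in `model_type`, which reflects why both options are accepted: the feature engineering and class weighting are shared, and the choice between them is a trade-off between cost and model capacity.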