
You are a data scientist at an industrial equipment manufacturing company responsible for optimizing energy usage. Your task is to develop a regression model to estimate the power consumption in the company’s manufacturing plants. This model will be based on extensive sensor data collected from all plants, amounting to tens of millions of records daily. You need to schedule daily training runs for your model that incorporate all the data collected up to the current date. The goal is for the model to scale efficiently and require minimal development work. Considering these requirements, what should you do?
A. Train a regression model using AutoML Tables.
B. Develop a custom TensorFlow regression model, and optimize it using Vertex AI Training.
C. Develop a custom scikit-learn regression model, and optimize it using Vertex AI Training.
D. Develop a regression model using BigQuery ML.
Explanation:
The correct answer is D. Given the large volume of sensor data, tens of millions of records every day, the best solution is BigQuery ML. BigQuery ML scales efficiently to massive datasets and lets you build and train machine learning models directly with SQL queries, so daily retraining on all data collected up to the current date can be handled with a simple scheduled query. This requires far less development work than building and maintaining custom TensorFlow or scikit-learn models on Vertex AI Training (options B and C), which would also need a separate pipeline to feed tens of millions of new records into each training run. Furthermore, AutoML Tables (option A) has limits on dataset size and offers less control over scheduled retraining, making BigQuery ML the more suitable choice for this scenario.
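As an illustration, a daily retraining job in BigQuery ML can be a single SQL statement run on a schedule. The dataset, table, and column names below are hypothetical; the statement assumes the sensor data lands in a date-partitioned BigQuery table with a `power_consumption` label column:

```sql
-- Sketch only: dataset/table/column names are assumptions, not from the question.
-- CREATE OR REPLACE MODEL retrains from scratch on each scheduled run,
-- incorporating all rows collected up to the current date.
CREATE OR REPLACE MODEL `plant_energy.power_consumption_model`
OPTIONS (
  model_type = 'LINEAR_REG',              -- built-in regression model type
  input_label_cols = ['power_consumption'] -- column the model predicts
) AS
SELECT
  *
FROM
  `plant_energy.sensor_readings`
WHERE
  reading_date <= CURRENT_DATE();
```

Running this as a BigQuery scheduled query gives the required daily cadence with no custom training code or infrastructure to manage.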