
Answer-first summary for fast verification
Answer: Use Dataprep to transform the state column using a one-hot encoding method, and make each city a column with binary values.
The correct answer is D. Using Dataprep to transform the city column using a one-hot encoding method and making each city a column with binary values ensures that the city name variable is retained as a predictor while organizing the data in a columnar format. This allows for minimal coding and leverages BigQuery's built-in support for one-hot encoding of categorical features, which is essential for linear regression models.
Author: LeetQuiz Editorial Team
Ultimate access to all questions.
No comments yet.
You are developing a linear regression model using BigQuery ML to predict the likelihood that a customer will purchase your company's products. One of your key predictive variables is the name of the city where the customer is located. For effective model training and serving, your data needs to be organized in a tabular format with each feature as a column. To convert this city name variable into a format suitable for the model with minimal coding effort, what should you do?
A
Use TensorFlow to create a categorical variable with a vocabulary list. Create the vocabulary file, and upload it as part of your model to BigQuery ML.
B
Create a new view with BigQuery that does not include a column with city information.
C
Use Cloud Data Fusion to assign each city to a region labeled as 1, 2, 3, 4, or 5, and then use that number to represent the city in the model.
D
Use Dataprep to transform the state column using a one-hot encoding method, and make each city a column with binary values.