Google Professional Machine Learning Engineer

You are preparing a dataset for a linear regression model. The dataset includes categorical variables, such as 'Product Category' with values like 'Electronics', 'Clothing', and 'Home Appliances', as well as numerical features. Your goal is to preprocess the data so the model performs well while keeping computation efficient and the model interpretable. Which of the following techniques should you use to handle the categorical data effectively? Choose the best option.

Data augmentation to artificially increase the size of the dataset by creating variations of the existing data points.

2.9%

Normalization to scale all numerical features to a range between 0 and 1, without addressing the categorical data.

Both one-hot encoding for categorical variables and standardization for numerical features, to ensure all data is appropriately scaled and interpretable by the model.

44.1%
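
To make the combined option concrete, here is a minimal sketch of one-hot encoding a categorical column and standardizing a numerical one, using only the Python standard library. The toy rows, the `price` feature, and the helper names `one_hot` and `standardize` are assumptions made for illustration; they are not part of the question.

```python
from statistics import mean, pstdev

# Hypothetical toy rows: (product_category, price). Values are illustrative only.
rows = [
    ("Electronics", 300.0),
    ("Clothing", 100.0),
    ("Home Appliances", 200.0),
    ("Clothing", 200.0),
]

def one_hot(values):
    """Return (sorted categories, one indicator vector per value)."""
    categories = sorted(set(values))
    encoded = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return categories, encoded

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

categories, encoded = one_hot([r[0] for r in rows])
scaled_price = standardize([r[1] for r in rows])

# Final feature matrix: indicator columns followed by the scaled numeric column.
features = [enc + [z] for enc, z in zip(encoded, scaled_price)]
```

Each indicator column maps to one category, so a fitted coefficient can be read as that category's effect, which is what keeps the model interpretable. When the regression includes an intercept, one indicator column is usually dropped to avoid perfect collinearity among the columns.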