
Explanation:
Correct Option: C. One-hot encoding
One-hot encoding is the most appropriate technique for transforming categorical data into a numerical format that machine learning algorithms can process. It involves creating a binary column for each category, where a '1' signifies the presence of the category and a '0' indicates its absence. This method preserves the information in the categorical variables without introducing an ordinal relationship where none exists.
Why other options are incorrect:
Ultimate access to all questions.
In the context of preparing data for machine learning models, you are working with a dataset that includes categorical variables such as 'color' with values like 'red', 'blue', and 'green'. The dataset also contains numerical features. Your goal is to preprocess this data to ensure it is suitable for a machine learning algorithm that requires numerical input. Considering the need for scalability and the preservation of information, which of the following techniques should you employ to transform the categorical data into a format suitable for machine learning algorithms? Choose one correct option.
A
Data augmentation
B
Normalization
C
One-hot encoding
D
Standardization
No comments yet.