
Answer-first summary for fast verification
Answer: One-hot encoding creates binary variables for each category, which can lead to high dimensionality and sparse data.
One-hot encoding involves creating binary variables for each category of a categorical feature, which can lead to high dimensionality and sparse data. This can be inefficient for tree-based models, as it increases the number of features and can complicate the tree-building process. An alternative approach for tree-based models is to use ordinal encoding or target encoding, which map categories to numerical values based on their relationship with the target variable, reducing dimensionality and potentially improving model performance.
Author: LeetQuiz Editorial Team
Ultimate access to all questions.
Describe the process of one-hot encoding categorical features. Explain why this method can be inefficient for tree-based models and suggest an alternative approach for handling categorical data in such models.
A
One-hot encoding creates binary variables for each category, which can lead to high dimensionality and sparse data.
B
One-hot encoding is efficient for all types of models, including tree-based models.
C
One-hot encoding should not be used for categorical data.
D
One-hot encoding is only suitable for linear models.
No comments yet.