
In a machine learning dataset, you have a categorical feature 'Marital Status' with possible values 'Single', 'Married', 'Divorced', and 'Widowed'. You have decided to use one-hot encoding to transform this feature. Explain the process of one-hot encoding and discuss why this approach can be inefficient for tree-based models.
A
One-hot encoding involves creating a new binary column for each unique value in the 'Marital Status' feature, resulting in four columns. This approach is inefficient for tree-based models because they cannot handle categorical features directly.
B
One-hot encoding involves creating a new binary column for each unique value in the 'Marital Status' feature, excluding one value to avoid multicollinearity. This approach is inefficient for tree-based models because it can lead to a large number of binary columns, increasing the dimensionality of the dataset.
C
One-hot encoding involves creating a new binary column for each unique value in the 'Marital Status' feature, resulting in three columns. This approach is inefficient for tree-based models because they can handle categorical features directly without the need for one-hot encoding.
D
One-hot encoding involves creating a new ordinal column for each unique value in the 'Marital Status' feature, assigning numerical values based on the order of the categories. This approach is inefficient for tree-based models because it introduces an arbitrary order to the categories.
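To make the encoding step in the question concrete, here is a minimal sketch of one-hot encoding the 'Marital Status' feature in plain Python. The sample values, the `one_hot` helper, and the alphabetical column order are assumptions made for illustration; they are not part of the question, and a real pipeline would more likely use a library encoder.

```python
# Hypothetical sample values for the 'Marital Status' feature.
statuses = ["Single", "Married", "Divorced", "Widowed", "Married"]

# One binary column per unique category (sorted for a stable column order).
categories = sorted(set(statuses))


def one_hot(values, categories):
    """Return one row per value, with a 1 in the column of its category."""
    return [[1 if value == cat else 0 for cat in categories] for value in values]


# Full encoding: four categories give four binary columns.
encoded = one_hot(statuses, categories)

# Reduced variant: dropping one category leaves three columns; the dropped
# category is then represented by a row of all zeros.
reduced_categories = categories[1:]
reduced = one_hot(statuses, reduced_categories)

for status, row in zip(statuses, encoded):
    print(status, row)
```

In the full encoding every row contains exactly one 1. Each binary column separates a single category from all the others, so a tree split on one of these columns can only isolate one marital status at a time.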