I know that categorical data should be one-hot encoded before training a machine learning algorithm. I also know that for multivariate linear regression I need to exclude one of the encoded variables to avoid the so-called dummy variable trap.
Ex: If I have a categorical feature "size" with values "small", "medium", "large", then after one-hot encoding I would have something like:

small  medium  large  other-feature
  0      1       0        2999
So to avoid the dummy variable trap I need to remove one of the three columns, for example the column "small".
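For context, this is a minimal sketch of the two encodings using pandas (the column names and data are made up for illustration). Note that `drop_first=True` drops the first category in sorted order, which here is "large" rather than "small":

```python
import pandas as pd

df = pd.DataFrame({
    "size": ["small", "medium", "large", "medium"],
    "other_feature": [2999, 1200, 500, 800],
})

# Full one-hot encoding: one column per category.
full = pd.get_dummies(df, columns=["size"], dtype=int)

# drop_first=True removes one category per feature,
# avoiding the dummy variable trap for linear models.
reduced = pd.get_dummies(df, columns=["size"], drop_first=True, dtype=int)

print(full.columns.tolist())     # includes size_large, size_medium, size_small
print(reduced.columns.tolist())  # one size_* column fewer
```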
Should I do the same when training a neural network? Or is this purely for multivariate regression?
Thanks.
As stated here, the dummy variable trap needs to be avoided (one category of each categorical feature removed after encoding but before training) on the input of algorithms that consider all the predictors together as a linear combination, such as linear regression and other linear models.
If you remove a category from the input of a neural network that employs weight decay, the model will instead become biased in favor of the omitted category, since its effect is absorbed into the unpenalized intercept rather than a penalized weight.
Even though no information is lost when one category is omitted after encoding a feature, other algorithms have to infer the effect of the omitted category indirectly, through the combination of all the remaining categories, doing more computation to reach the same result.