I'm trying to understand how to use categorical data as features in sklearn.linear_model's LogisticRegression.
I understand, of course, that I need to encode it.
What I don't understand is how to pass the encoded feature to the logistic regression so that it's processed as a categorical feature, rather than having the int value it got during encoding interpreted as a standard quantifiable feature.
(Less important) Can somebody explain the difference between using preprocessing.LabelEncoder(), DictVectorizer.vocabulary, or just encoding the categorical data yourself with a simple dict? Alex A.'s comment here touches on the subject, but not very deeply, especially regarding the first one!
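For reference, here is roughly how I understand the three options on a tiny made-up example (so I may well be misusing them):

from sklearn.preprocessing import LabelEncoder
from sklearn.feature_extraction import DictVectorizer

colors = ['red', 'green', 'blue', 'green']

# Option 1: LabelEncoder -- maps each category to an integer
le = LabelEncoder()
print(le.fit_transform(colors))  # [2 1 0 1]

# Option 2: DictVectorizer -- one-hot encodes dict records
dv = DictVectorizer(sparse=False)
print(dv.fit_transform([{'color': c} for c in colors]))
print(dv.vocabulary_)  # {'color=blue': 0, 'color=green': 1, 'color=red': 2}

# Option 3: a plain dict mapping applied by hand
mapping = {'red': 0, 'green': 1, 'blue': 2}
print([mapping[c] for c in colors])  # [0, 1, 2, 1]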
In recent sklearn versions you can use le.fit for categorical variables with more than two classes.
Yes, you can train a logistic regression model on categorical data. Each feature will basically be on/off, which actually simplifies things.
Logistic regression is a pretty flexible method. It can readily use categorical variables as independent variables, and most software that implements logistic regression should let you do so.
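For example, here is a minimal sketch (the column names and data are made up) of one-hot encoding a categorical column with pandas and fitting LogisticRegression on the result:

import pandas as pd
from sklearn.linear_model import LogisticRegression

# toy data: 'city' is categorical, 'age' is numeric, 'bought' is the target
df = pd.DataFrame({
    'city': ['london', 'paris', 'paris', 'berlin', 'london', 'berlin'],
    'age':  [23, 35, 41, 29, 52, 33],
    'bought': [0, 1, 1, 0, 1, 0],
})

# one-hot encode 'city' so each category becomes its own 0/1 column
X = pd.get_dummies(df[['city', 'age']], columns=['city'], drop_first=True)
y = df['bought']

model = LogisticRegression()
model.fit(X, y)
print(model.predict(X))

Because each category is now a separate 0/1 column, the model never treats the categories as an ordered numeric scale.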
Suppose the type of each categorical column is "object". First, you can create a pandas Index of the categorical column names:
import pandas as pd

# names of all columns whose dtype is 'object' (the categorical ones)
catColumns = df.select_dtypes(['object']).columns
Then, you can create the indicator variables using the for-loop below. For binary categorical variables, use LabelEncoder() to convert them to 0 and 1. For categorical variables with more than two categories, use pd.get_dummies() to obtain the indicator variables and then drop one category (to avoid the multicollinearity issue).
from sklearn import preprocessing

le = preprocessing.LabelEncoder()
for col in catColumns:
    n = len(df[col].unique())
    if n > 2:
        # more than two categories: one-hot encode and drop the first level
        X = pd.get_dummies(df[col])
        X = X.drop(X.columns[0], axis=1)
        df[X.columns] = X
        df.drop(col, axis=1, inplace=True)  # drop the original categorical column (optional)
    else:
        # binary category: label-encode in place as 0/1
        le.fit(df[col])
        df[col] = le.transform(df[col])
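After that loop, every column in df is numeric, so the frame can go straight into LogisticRegression. A minimal follow-up sketch (the 'target' column name is a placeholder for whatever your label column is called):

from sklearn.linear_model import LogisticRegression

y = df['target']                 # placeholder name for the label column
X = df.drop('target', axis=1)    # remaining columns are 0/1 indicators or label-encoded values

clf = LogisticRegression()
clf.fit(X, y)
print(clf.score(X, y))           # training accuracy, just as a sanity check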