
Keep other variables when executing get_dummies in Pandas

I have a DataFrame with an ID variable and another categorical variable. I want to create dummy variables out of the categorical variable with get_dummies.

dum = pd.get_dummies(df)

However, this makes the ID variable disappear, and I need it later on to merge with other data sets.

Is there a way to keep the other variables? I could not find anything in the documentation of get_dummies. Thanks!

Bert Carremans asked Dec 15 '22 04:12



2 Answers

You can also copy the original column into a new one before executing get_dummies, and pass columns= so that only the categorical column is encoded. For example:

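# Keep a copy of the categorical column, since get_dummies replaces it
df['dum_orig'] = df['dum']
# Encode only 'dum'; the other columns (e.g. the ID) are left unchanged
df = pd.get_dummies(df, columns=['dum'])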
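A self-contained sketch of this approach may make it easier to try out. The example data and the `id` column name are made up for illustration; only `dum` and `dum_orig` come from the snippet above.

```python
import pandas as pd

# Hypothetical example data; the 'id' column and its values are assumptions.
df = pd.DataFrame({"id": ["a1", "a2", "a3"], "dum": ["x", "y", "x"]})

# Keep a copy of the categorical column, then encode only 'dum'.
df["dum_orig"] = df["dum"]
encoded = pd.get_dummies(df, columns=["dum"])

# 'id' and 'dum_orig' are still present next to the new dummy columns.
print(encoded.columns.tolist())
```

Because `columns=` limits the encoding to the listed column, a string-typed ID column is not turned into dummies itself, which is likely what happens when the whole DataFrame is passed without it.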
Tom answered Mar 17 '23 06:03



I found the answer. You can concatenate the dummies data set to the original data set as shown below. Since pd.concat with axis=1 aligns rows on the index, this works as long as neither data set has had its index reset or changed in the meantime.

df = pd.concat([df, dum], axis=1) 
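Note that pd.concat with axis=1 matches rows by index label rather than by position. A minimal sketch of that behaviour, assuming a hypothetical frame with `id` and `cat` columns (both names and the data are made up for illustration):

```python
import pandas as pd

# Hypothetical example data; the 'id' and 'cat' column names are assumptions.
df = pd.DataFrame({"id": ["a1", "a2", "a3"], "cat": ["x", "y", "x"]})

# Encode only the categorical column, so the dummies frame has no 'id'.
dum = pd.get_dummies(df["cat"], prefix="cat")

# Shuffle the original rows; the index labels travel with the rows.
shuffled = df.sample(frac=1, random_state=0)

# concat with axis=1 matches rows by index label, not by position.
combined = pd.concat([shuffled, dum], axis=1)
print(combined)
```

Each row should still carry the dummy values that belong to its own `cat` value. The alignment would break if the index were reset on one of the frames (for example with reset_index(drop=True)) after re-ordering.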
Bert Carremans answered Mar 17 '23 05:03
