I have a column Place in a pandas DataFrame that looks like this:
**Place**
Berlin
Prague
Mexico
Prague
Mexico
...
I'd like to get the following:
is_Berlin is_Prague is_Mexico
1 0 0
0 1 0
0 0 1
0 1 0
0 0 1
I know I can create the columns separately:
df['is_Berlin'] = df['Place']
df['is_Prague'] = df['Place']
df['is_Mexico'] = df['Place']
And then create a dictionary for each column and apply a map function.
#Example just for is_Berlin column
d = {'Berlin': 1,'Prague': 0,'Mexico': 0}
df['is_Berlin'] = df['is_Berlin'].map(d)
But I find this somewhat tedious, and I believe there is a nicer, more Pythonic way to do it.
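For reference, the manual approach described above can be written out as a self-contained sketch. The sample data is assumed from the Place column shown earlier, and the loop is only a shorthand for repeating the dictionary-and-map step once per city:

```python
import pandas as pd

# Sample data assumed from the Place column in the question
df = pd.DataFrame({'Place': ['Berlin', 'Prague', 'Mexico', 'Prague', 'Mexico']})

for place in ['Berlin', 'Prague', 'Mexico']:
    # Dictionary mapping the current place to 1 and every other place to 0
    d = {p: int(p == place) for p in df['Place'].unique()}
    df['is_' + place] = df['Place'].map(d)

print(df)
```

This works, but one dictionary has to be built per distinct value, which is the tedious part.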
You can use str.get_dummies, and if you need to add the new columns to the original DataFrame, use concat:
df1 = df.Place.str.get_dummies()
print(df1)
Berlin Mexico Prague
0 1 0 0
1 0 0 1
2 0 1 0
3 0 0 1
4 0 1 0
df1.columns = ['is_' + col for col in df1.columns]
print(df1)
is_Berlin is_Mexico is_Prague
0 1 0 0
1 0 0 1
2 0 1 0
3 0 0 1
4 0 1 0
df = pd.concat([df, df1], axis=1)
print(df)
Place is_Berlin is_Mexico is_Prague
0 Berlin 1 0 0
1 Prague 0 0 1
2 Mexico 0 1 0
3 Prague 0 0 1
4 Mexico 0 1 0
# If you no longer need the original Place column, you can drop it
df = df.drop('Place', axis=1)
print(df)
is_Berlin is_Mexico is_Prague
0 1 0 0
1 0 0 1
2 0 1 0
3 0 0 1
4 0 1 0
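As an alternative to the steps above, the top-level pd.get_dummies function can probably do the renaming in one go through its prefix and prefix_sep parameters. The following is a minimal sketch, not the method used above; the sample data is assumed from the question, and dtype=int is passed because newer pandas versions return boolean columns by default:

```python
import pandas as pd

# Sample data assumed from the Place column in the question
df = pd.DataFrame({'Place': ['Berlin', 'Prague', 'Mexico', 'Prague', 'Mexico']})

# prefix and prefix_sep build the is_<value> column names directly;
# dtype=int requests 0/1 integers instead of booleans
dummies = pd.get_dummies(df['Place'], prefix='is', prefix_sep='_', dtype=int)

# Attach the indicator columns to the original DataFrame by index
result = df.join(dummies)
print(result)
```

Note that str.get_dummies splits each string on '|' by default, so pd.get_dummies may be the safer choice if the values themselves can contain that character.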