I have a column Place in a pandas DataFrame which looks like this:
**Place**
Berlin
Prague
Mexico
Prague
Mexico
...
I'd like to get the following:
is_Berlin   is_Prague   is_Mexico
1           0           0
0           1           0
0           0           1
0           1           0
0           0           1   
I know I can create the columns separately:
df['is_Berlin'] = df['Place']
df['is_Prague'] = df['Place']
df['is_Mexico'] = df['Place']
And then create a dictionary for each column and apply a map function.
#Example just for is_Berlin column
d = {'Berlin': 1,'Prague': 0,'Mexico': 0} 
df['is_Berlin'] = df['is_Berlin'].map(d)
But I find this somewhat tedious, and I believe there is a nice Pythonic way to do it.
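For reference, the manual approach described above could be sketched as follows. The sample frame is built by hand from the rows shown, and the loop and the `places` list are only illustrative names for repeating the per-column dictionary step:

```python
import pandas as pd

# Sample data reconstructed from the rows shown in the question
df = pd.DataFrame({'Place': ['Berlin', 'Prague', 'Mexico', 'Prague', 'Mexico']})

# Hypothetical helper list: the distinct places, in the desired column order
places = ['Berlin', 'Prague', 'Mexico']

for place in places:
    # One dictionary per column: 1 for the matching place, 0 for the others
    d = {p: int(p == place) for p in places}
    df['is_' + place] = df['Place'].map(d)

print(df)
```

This works, but the list of places has to be maintained by hand, which is the tedious part.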
You can use str.get_dummies, and if you need to add the new columns to the original DataFrame, use concat:
df1 = df.Place.str.get_dummies()
print(df1)
   Berlin  Mexico  Prague
0       1       0       0
1       0       0       1
2       0       1       0
3       0       0       1
4       0       1       0
df1.columns = ['is_' + col for col in df1.columns]
print(df1)
   is_Berlin  is_Mexico  is_Prague
0          1          0          0
1          0          0          1
2          0          1          0
3          0          0          1
4          0          1          0
df = pd.concat([df, df1], axis=1)
print(df)
    Place  is_Berlin  is_Mexico  is_Prague
0  Berlin          1          0          0
1  Prague          0          0          1
2  Mexico          0          1          0
3  Prague          0          0          1
4  Mexico          0          1          0
# if there are more columns, you can drop the Place column
df = df.drop('Place', axis=1)
print(df)
   is_Berlin  is_Mexico  is_Prague
0          1          0          0
1          0          0          1
2          0          1          0
3          0          0          1
4          0          1          0
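As an alternative sketch, the top-level pd.get_dummies function can add the prefix in one step. This assumes a reasonably recent pandas; dtype=int is passed because newer versions return boolean columns by default. The sample frame is rebuilt by hand from the rows in the question:

```python
import pandas as pd

# Sample data reconstructed from the rows shown in the question
df = pd.DataFrame({'Place': ['Berlin', 'Prague', 'Mexico', 'Prague', 'Mexico']})

# prefix='is' combined with the default separator '_' yields is_Berlin, is_Mexico, ...
dummies = pd.get_dummies(df['Place'], prefix='is', dtype=int)

# Attach the indicator columns to the original frame
result = pd.concat([df, dummies], axis=1)
print(result)
```

The indicator columns come out in sorted order (Berlin, Mexico, Prague); select them explicitly if you need the order from the question.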