Creating new binary columns from single string column in pandas

Tags:

pandas

I've seen this before and simply can't remember the function.

Say I have a column "Speed" and each row has 1 of these values:

'Slow', 'Normal', 'Fast'

How do I create a new dataframe that keeps all my rows but replaces the "Speed" column with three columns, "Slow", "Normal" and "Fast", where each row has a 1 in the column matching its old "Speed" value and a 0 in the other two? So if I had:

print(df['Speed'].iloc[0])
> 'Normal'

I would now expect this:

print(df['Normal'].iloc[0])
> 1

print(df['Slow'].iloc[0])
> 0

asked Mar 24 '14 22:03

user1610719

2 Answers

You can do this easily with pd.get_dummies:

In [37]: df = pd.DataFrame(['Slow', 'Normal', 'Fast', 'Slow'], columns=['Speed'])

In [38]: df
Out[38]:
    Speed
0    Slow
1  Normal
2    Fast
3    Slow

In [39]: pd.get_dummies(df['Speed'])
Out[39]:
   Fast  Normal  Slow
0     0       0     1
1     0       1     0
2     1       0     0
3     0       0     1
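
To get the new dataframe the question asks for (every original column except "Speed", plus the indicator columns), one possible sketch is to drop the original column and concatenate the dummies onto what is left. The "Distance" column below is hypothetical, added only to show that other columns survive. dtype=int is passed because recent pandas versions return boolean columns by default rather than 0/1.

```python
import pandas as pd

# "Distance" is a made-up extra column, used only for illustration.
df = pd.DataFrame(
    {"Speed": ["Slow", "Normal", "Fast", "Slow"], "Distance": [10, 20, 30, 40]}
)

# One indicator column per distinct value; dtype=int forces 0/1 output.
dummies = pd.get_dummies(df["Speed"], dtype=int)

# Drop the original column and attach the indicator columns, aligned on the index.
result = pd.concat([df.drop(columns="Speed"), dummies], axis=1)
print(result)
```

pd.get_dummies(df, columns=['Speed']) does the drop and join in one call, but it prefixes the new column names (Speed_Fast and so on), so the explicit concat is used here to keep the plain "Fast", "Normal", "Slow" names.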

answered Sep 22 '22 10:09

joris

Here is one solution:

df['Normal'] = df.Speed.apply(lambda x: 1 if x == "Normal" else 0)
df['Slow'] = df.Speed.apply(lambda x: 1 if x == "Slow" else 0)
df['Fast'] = df.Speed.apply(lambda x: 1 if x == "Fast" else 0)
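
As a variation on the same idea, a plain comparison works on the whole column at once and avoids calling a Python lambda for every row. A minimal sketch, assuming the three values are known in advance:

```python
import pandas as pd

df = pd.DataFrame({"Speed": ["Slow", "Normal", "Fast", "Slow"]})

for level in ["Slow", "Normal", "Fast"]:
    # The comparison yields a boolean Series; astype(int) turns it into 0/1.
    df[level] = (df["Speed"] == level).astype(int)

# Optionally drop the original column once the indicators exist.
result = df.drop(columns="Speed")
print(result)
```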

answered Sep 24 '22 10:09

aha
