python pandas: split comma-separated column into new columns - one per value

Tags: python, pandas, data-cleaning

I have a dataframe like this:

import numpy as np
import pandas as pd

data = np.array([["userA", "event2, event3"],
                 ["userB", "event3, event4"],
                 ["userC", "event2"]])

data = pd.DataFrame(data)

        0         1
0   userA   "event2, event3"
1   userB   "event3, event4"
2   userC   "event2"

Now I would like to get a dataframe like this, with one column per distinct event:

       0    event2      event3      event4
0   userA     1           1
1   userB                 1           1
2   userC     1

Can anybody help, please?


asked Feb 16 '18 08:02

funkfux

1 Answer

It seems you need str.get_dummies to build one 0/1 column per event, followed by replace to turn the 0 values into empty strings:

df = data[[0]].join(data[1].str.get_dummies(', ').replace(0, ''))
print(df)
       0 event2 event3 event4
0  userA      1      1       
1  userB             1      1
2  userC      1
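
For readers who want to run the approach end to end, here is a minimal self-contained sketch. It assumes the same sample data as the question but builds the frame with named columns; the names "user" and "events" are hypothetical and chosen only for this illustration.

```python
import pandas as pd

# Same sample data as in the question, but with named columns.
# "user" and "events" are hypothetical names used only in this sketch.
data = pd.DataFrame({
    "user": ["userA", "userB", "userC"],
    "events": ["event2, event3", "event3, event4", "event2"],
})

# One 0/1 indicator column per distinct value found between the separators.
dummies = data["events"].str.get_dummies(sep=", ")

# Blank out the zeros for display. The indicator columns now hold a mix of
# integers and strings, so they are no longer purely numeric.
df = data[["user"]].join(dummies.replace(0, ""))
print(df)
```

If the indicators are needed for further computation (sums, filters), it is probably better to skip the replace step and keep the integer columns.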

Detail:

print(data[1].str.get_dummies(', '))
   event2  event3  event4
0       1       1       0
1       0       1       1
2       1       0       0
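
Note that the separator passed to get_dummies is matched literally, so ', ' only works when every value is separated by exactly a comma and one space. The following sketch shows one possible way to handle inconsistent spacing by normalizing the strings first; the sample series with irregular whitespace is made up for illustration and is not from the question.

```python
import pandas as pd

# Hypothetical input with inconsistent spacing around the commas.
events = pd.Series(["event2,event3", "event3 ,  event4", " event2 "])

# Trim the ends, then collapse any whitespace around commas to a bare comma.
normalized = (
    events.str.strip()
          .str.replace(r"\s*,\s*", ",", regex=True)
)

# With clean separators, a plain "," can be used as the delimiter.
dummies = normalized.str.get_dummies(sep=",")
print(dummies)
```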


answered Sep 21 '22 15:09

jezrael
