
Pandas: how to convert a cell with multiple values to multiple rows?

I have a DataFrame like this:

Name asn  count
Org1 asn1,asn2 1
org2 asn3      2
org3 asn4,asn5 5

I would like to convert my DataFrame to look like this:

Name asn  count
Org1 asn1 1
Org1 asn2 1 
org2 asn3 2
org3 asn4 5
org3 asn5 5

I used the following code to do this with two columns, but I am not sure how to do it for three.

df2 = df.asn.str.split(',').apply(pd.Series)          
df2.index = df.Name                                   
df2 = df2.stack().reset_index('Name') 

Can anybody help?

UserYmY asked Apr 13 '15 12:04


1 Answer

Carrying on from the same idea, you could set a MultiIndex for df2 and then stack. For example:

>>> df2 = df.asn.str.split(',').apply(pd.Series)
>>> df2.index = df.set_index(['Name', 'count']).index
>>> df2.stack().reset_index(['Name', 'count'])
   Name  count     0
0  Org1      1  asn1
1  Org1      1  asn2
0  org2      2  asn3
0  org3      5  asn4
1  org3      5  asn5

You can then rename the column and set an index of your choosing.
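
As a rough sketch of that last step, the snippet below rebuilds the example DataFrame from the question, repeats the MultiIndex-and-stack approach, then renames the `0` column to `asn` and swaps the leftover index for a plain integer one. The variable names `stacked` and `result` and the final column order are my own choices for illustration, and the explicit `dropna()` is an assumption added so the padding values disappear regardless of how your pandas version's `stack()` treats missing values.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "Name": ["Org1", "org2", "org3"],
    "asn": ["asn1,asn2", "asn3", "asn4,asn5"],
    "count": [1, 2, 5],
})

# Split each asn string into one column per value (shorter rows are padded with NaN).
df2 = df["asn"].str.split(",").apply(pd.Series)

# Carry both Name and count along as a MultiIndex.
df2.index = df.set_index(["Name", "count"]).index

# Stack into one row per value; dropna() removes any NaN padding that survives stack().
stacked = df2.stack().dropna().reset_index(["Name", "count"])

# Rename the unnamed value column, replace the leftover index, and restore the column order.
result = (
    stacked.rename(columns={0: "asn"})
    .reset_index(drop=True)
    [["Name", "asn", "count"]]
)

print(result)
```

This should give five rows, one per asn value, with `Name` and `count` repeated for each. If you would rather index by organisation, `result.set_index("Name")` is one option.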

Alex Riley answered Sep 28 '22 09:09