Convert cells in dataframe with multiple values to multiple rows

Tags: python, pandas

My data is like this:

Name    test1     test2      Count
Emp1    X,Y        A           1
Emp2    X          A,B,C       2
Emp3    Z          C           3

I'm using the code below to split test1 cells that contain multiple values into individual rows. However, I am not sure how to split the test2 column as well.

df2 = df.test1.str.split(',').apply(pd.Series)
df2.index = df.set_index(['Name', 'count']).index
df2.stack().reset_index(['Name', 'count'])
df2

And the output is:

Name    test1   Count
Emp1    X        1
Emp1    Y        1
Emp2    X        2
Emp2    X        2
Emp2    X        2
Emp3    Z        3

I'm trying to split test1 and test2 so that I can achieve this output:

Name    test1    test2  Count
Emp1    X          A      1
Emp1    Y          A      1
Emp2    X          A      2
Emp2    X          B      2
Emp2    X          C      2
Emp3    Z          C      3

Can anybody help, please?

asked Oct 17 '25 10:10 by PieSquare


1 Answer

I have only fixed your code here. I do not recommend this method of unnesting a DataFrame; you can check the answer here, which shows several nicer ways.

# Split test1 into one row per value, keeping Name and Count as columns
df2 = df.test1.str.split(',').apply(pd.Series)
df2.index = df.set_index(['Name', 'Count']).index
df2 = df2.stack().reset_index(['Name', 'Count'])
# Do the same for test2
df3 = df.test2.str.split(',').apply(pd.Series)
df3.index = df.set_index(['Name', 'Count']).index
df3 = df3.stack().reset_index(['Name', 'Count'])

Then just merge the two results on Name and Count:

df2.merge(df3,on=['Name', 'Count'],how='outer')
Out[132]: 
   Name  Count 0_x 0_y
0  Emp1      1   X   A
1  Emp1      1   Y   A
2  Emp2      2   X   A
3  Emp2      2   X   B
4  Emp2      2   X   C
5  Emp3      3   Z   C
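
As one possible alternative to the stack-and-merge approach, pandas 0.25 and later provides DataFrame.explode, which turns list-like cells into separate rows. The sketch below rebuilds the sample data from the question and explodes test1, then test2. The helper name split_and_explode is made up for illustration, and the ignore_index argument assumes pandas 1.1 or newer. It is not necessarily the method the linked answer describes.

```python
import pandas as pd

# Sample data reconstructed from the table in the question.
df = pd.DataFrame({
    "Name": ["Emp1", "Emp2", "Emp3"],
    "test1": ["X,Y", "X", "Z"],
    "test2": ["A", "A,B,C", "C"],
    "Count": [1, 2, 3],
})


def split_and_explode(frame, columns, sep=","):
    """Hypothetical helper: split each listed column on `sep`, then explode it.

    Exploding the columns one after another yields, for every original row,
    all combinations of its test1 and test2 values.
    """
    out = frame.copy()
    for col in columns:
        out[col] = out[col].str.split(sep)            # "A,B,C" -> ["A", "B", "C"]
        out = out.explode(col, ignore_index=True)     # one row per list element
    return out


result = split_and_explode(df, ["test1", "test2"])
print(result)
```

This keeps the original column names (test1, test2) rather than the 0_x and 0_y produced by the merge, and it gives the six rows requested in the question.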
answered Oct 19 '25 23:10 by BENY


