I have a pandas dataframe. I need to convert some of the columns into rows. The dataframe has same data in the first two columns for every 3 rows. So, I need 6 more columns as you will see in my expected dataframe. I have the following dataframe:
shopCode Product Code Score
111 Apple 123 0.70
111 Apple 456 0.75
111 Apple 789 0.80
222 Orange 142 0.66
222 Orange 136 0.83
222 Orange 623 0.76
My expected dataframe is:
shopCode Product Code1 Code2 Code3 Score1 Score2 Score3
111 Apple 123 456 789 0.70 0.75 0.80
222 Orange 142 136 623 0.66 0.83 0.76
I tried to use
df.pivot(index=['shopCode', 'Product'], columns=['Code1', 'Code2', 'Code3', 'Score1', 'Score2', 'Score3'], values=['Code', 'Score'])
but it doesn't work.
Can be achieved with pandas groupby and dictionary merge.
df.groupby(['shopCode', 'Product']).apply(lambda x: pd.Series(
{
**{'Code'+str(i+1): t for i,t in enumerate(x.Code)},
**{'Score'+str(i+1): t for i,t in enumerate(x.Score)}
}
)).reset_index()
We using pivot_table
df=pd.pivot_table(df,index=['shopCode','Product'],columns=df.groupby(['shopCode','Product']).cumcount().add(1),values=['Code','Score'],aggfunc='sum')
df.columns=df.columns.map('{0[0]}{0[1]}'.format)
df
Out[112]:
Code1 Code2 Code3 Score1 Score2 Score3
shopCode Product
111 Apple 123 456 789 0.70 0.75 0.80
222 Orange 142 136 623 0.66 0.83 0.76
After reset_index
df.reset_index()
Out[113]:
shopCode Product Code1 Code2 Code3 Score1 Score2 Score3
0 111 Apple 123 456 789 0.70 0.75 0.80
1 222 Orange 142 136 623 0.66 0.83 0.76
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With