
Pandas; tricky pivot table

I have a pandas dataframe that I need to reshape/pivot. How to do it just seems beyond me at the moment. The dataframe looks like this:

Ref Statistic Val1 Val2 Val3 Val4
 0   Mean       0    1    2    3
 0   Std        0.1  0.1  0.1  0.1
 1   Mean       0    1    2    3
 1   Std        0.1  0.1  0.1  0.1
 2   Mean       0    1    2    3
 2   Std        0.1  0.1  0.1  0.1

And I'm aiming to get to this:

Ref Values Mean Std
 0    Val1   0  0.1
 0    Val2   1  0.1
 0    Val3   2  0.1
 0    Val4   3  0.1
 1    Val1   0  0.1
 1    Val2   1  0.1
 1    Val3   2  0.1
 1    Val4   3  0.1
 2    Val1   0  0.1
 2    Val2   1  0.1
 2    Val3   2  0.1
 2    Val4   3  0.1

It looks like this requires more than one pivot or a combination of pivot and groupby, but I'm having no luck...

Any ideas?

jramm asked Feb 13 '23 13:02


2 Answers

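>>> import pandas as pd
>>> # melt the Val* columns into long form: one row per (Ref, Statistic, Values)
>>> df1 = pd.melt(df, value_vars=['Val1', 'Val2', 'Val3', 'Val4'],
...               id_vars=['Statistic', 'Ref'], var_name='Values')
>>> # current pandas uses index=/columns=; the old rows=/cols= keywords were removed
>>> df1.pivot_table(values='value', index=['Ref', 'Values'], columns='Statistic')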
Statistic   Mean  Std
Ref Values           
0   Val1       0  0.1
    Val2       1  0.1
    Val3       2  0.1
    Val4       3  0.1
1   Val1       0  0.1
    Val2       1  0.1
    Val3       2  0.1
    Val4       3  0.1
2   Val1       0  0.1
    Val2       1  0.1
    Val3       2  0.1
    Val4       3  0.1

[12 rows x 2 columns]

If you do not want the MultiIndex shown above, call .reset_index() on the resulting DataFrame to turn Ref and Values back into ordinary columns.
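For completeness, here is a minimal end-to-end sketch of the melt, pivot_table and reset_index steps. The DataFrame is rebuilt by hand from the table in the question, and the names long_df, wide and flat are only illustrative; clearing the columns name at the end is optional and just removes the leftover "Statistic" label.

```python
import pandas as pd

# Sample data rebuilt by hand from the table in the question
df = pd.DataFrame({
    'Ref': [0, 0, 1, 1, 2, 2],
    'Statistic': ['Mean', 'Std'] * 3,
    'Val1': [0, 0.1] * 3,
    'Val2': [1, 0.1] * 3,
    'Val3': [2, 0.1] * 3,
    'Val4': [3, 0.1] * 3,
})

# Long form: one row per (Ref, Statistic, Values) combination
long_df = pd.melt(df, id_vars=['Ref', 'Statistic'],
                  value_vars=['Val1', 'Val2', 'Val3', 'Val4'],
                  var_name='Values')

# Spread the Statistic labels into columns, indexed by (Ref, Values)
wide = long_df.pivot_table(values='value', index=['Ref', 'Values'],
                           columns='Statistic')

# Flatten the MultiIndex back into ordinary columns
flat = wide.reset_index()
flat.columns.name = None  # drop the leftover 'Statistic' label on the columns

print(flat)
```

The resulting frame has the four columns Ref, Values, Mean and Std, which matches the layout asked for in the question.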

behzad.nouri answered Feb 16 '23 02:02


As an alternative to melt, you can set a MultiIndex and chain the stack and unstack commands:

import pandas
from io import StringIO  # Python 3 (on Python 2: from StringIO import StringIO)

datastring = StringIO('''\
Ref  Statistic  Val1  Val2  Val3  Val4
 0   Mean       0    1    2    3
 0   Std        0.1  0.1  0.1  0.1
 1   Mean       0    1    2    3
 1   Std        0.1  0.1  0.1  0.1
 2   Mean       0    1    2    3
 2   Std        0.1  0.1  0.1  0.1
''')

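# sep=r'\s+' splits on any run of whitespace and skips the leading space on the data rows
df = pandas.read_table(datastring, sep=r'\s+', index_col=['Ref', 'Statistic'])
df.columns.names = ['Values']
# level names are case-sensitive: the columns level is 'Values', not 'values'
df.stack(level='Values').unstack(level='Statistic')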

Statistic  Mean  Std
Ref Values                
0   Val1      0  0.1
    Val2      1  0.1
    Val3      2  0.1
    Val4      3  0.1
1   Val1      0  0.1
    Val2      1  0.1
    Val3      2  0.1
    Val4      3  0.1
2   Val1      0  0.1
    Val2      1  0.1
    Val3      2  0.1
    Val4      3  0.1

Paul H answered Feb 16 '23 02:02