Restore hierarchical column index when using groupby in pandas

Tags: python, pandas

I am using groupby in pandas to compute aggregate statistics on data whose columns are organized with a hierarchical index. In the end I want the computed statistics back in table form, with the group keys converted back into columns holding the group values. For example:

import numpy as np
import pandas as pd
index = pd.MultiIndex.from_tuples([('A', 'a'), ('B', 'b')])
df = pd.DataFrame(np.random.randn(8, 2), columns=index)

which produces, for example, this data frame:

          A         B
          a         b
0  0.511157  0.334748
1  0.031113 -0.477456
2  0.288080 -0.258238
3  0.138467 -0.955547
4 -0.087873  0.017494
5 -0.667393  1.190039
6 -0.068245 -1.282864
7 -0.996982  0.589667

Now I compute the statistics using groupby and reset the index to recreate a flat data frame:

df.groupby([('A','a')]).mean().reset_index()
     (A, a)         B
                    b
0 -0.996982  0.589667
1 -0.667393  1.190039
2 -0.087873  0.017494
3 -0.068245 -1.282864
4  0.031113 -0.477456
5  0.138467 -0.955547
6  0.288080 -0.258238
7  0.511157  0.334748

How can I make ('A', 'a') part of the column MultiIndex again, ideally automatically? In other words: is there a way to preserve the hierarchical column structure during the groupby operation?

languitar asked Mar 13 '23 07:03

1 Answer

For me, adding the parameter as_index=False to groupby works:

print(df.groupby([('A','a')], as_index=False).mean())
          A         B
          a         b
0 -0.765088 -0.556601
1 -0.628040  2.074559
2 -0.516396 -2.028387
3 -0.152027  0.389853
4  0.450218  1.474989
5  0.718040 -0.882018
6  1.932556 -0.977316
7  2.028468 -0.875167
jezrael answered May 03 '23 14:05
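
For anyone who wants to check the as_index=False approach on their own installation, here is a minimal sketch with small fixed values instead of random data, so the result is reproducible. The sample numbers are made up for illustration. It assumes a pandas version in which the tuple key is reinserted as a proper entry of the column MultiIndex; behavior may differ between versions, so the checks on result.columns are worth keeping:

```python
import pandas as pd

# Two-level column index, as in the question.
columns = pd.MultiIndex.from_tuples([("A", "a"), ("B", "b")])

# Fixed illustrative values (not from the original post) so each group
# has more than one row and the means are easy to verify by hand.
df = pd.DataFrame(
    [[1, 10.0], [1, 20.0], [2, 30.0], [2, 50.0]],
    columns=columns,
)

# as_index=False keeps the grouping key as a regular column
# instead of moving it into the row index.
result = df.groupby([("A", "a")], as_index=False).mean()

# Verify that the hierarchical column structure survived.
assert isinstance(result.columns, pd.MultiIndex)
assert list(result.columns) == [("A", "a"), ("B", "b")]

print(result)
```

If the assertions fail on a given pandas version, the grouping key was not restored as a column tuple there, and the columns would need to be rebuilt by hand.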