Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python (Pandas) Add subtotal on each lvl of multiindex dataframe

Assuming I have the following dataframe:

a       b       c      Sce1     Sce2    Sce3    Sce4    Sce5    Sc6
Animal  Ground  Dog    0.0      0.9     0.5     0.0     0.3     0.4  
Animal  Ground  Cat    0.6      0.5     0.3     0.5     1.0     0.2 
Animal  Air     Eagle  1.0      0.1     0.1     0.6     0.9     0.1 
Animal  Air     Owl    0.3      0.1     0.5     0.3     0.5     0.9     
Object  Metal   Car    0.3      0.3     0.8     0.6     0.5     0.6 
Object  Metal   Bike   0.5      0.1     0.4     0.7     0.4     0.2 
Object  Wood    Chair  0.9      0.6     0.1     0.9     0.2     0.8 
Object  Wood    Table  0.9      0.6     0.6     0.1     0.9     0.7 

I want to create a MultiIndex, which will contain the sum of each lvl. The output will look like this:

a      b      c     Sce1    Sce2    Sce3    Sce4    Sce5    Sce6
Animal              1.9     1.6     1.4     1.3     2.7     1.6 
       Ground       0.6     1.4     0.8     0.5     1.3     0.6 
              Dog   0.0     0.9     0.5     0.0     0.3     0.4 
              Cat   0.6     0.5     0.3     0.5     1.0     0.2 
       Air          1.3     0.2     0.7     0.8     1.4     1.0 
              Eagle 1.0     0.1     0.1     0.6     0.9     0.1 
              Owl   0.3     0.1     0.5     0.3     0.5     0.9 
Object              2.6     1.6     1.8     2.3     2.0     2.3 
       Metal        0.8     0.3     1.1     1.3     0.9     0.8 
              Car   0.3     0.3     0.8     0.6     0.5     0.6 
              Bike  0.5     0.1     0.4     0.7     0.4     0.2 
       Wood         1.8     1.3     0.6     1.0     1.1     1.5 
              Chair 0.9     0.6     0.1     0.9     0.2     0.8 
              Table 0.9     0.6     0.6     0.1     0.9     0.7 

At the moment I am using a loop to create three different dataframes on each level and then manipulate them on excel, as below. So I wanted to take this calculation in python if possible.

for i in range range(0,3):
    df = df.groupby(list(df.columns)[0:lvl], as_index=False).sum()
    return df

Many thanks in advance.

like image 570
pbou Avatar asked Apr 05 '17 17:04

pbou


2 Answers

you need to do two group by to get subtotals at every level of aggregation. Then add those back in to the initial DF. Here's a related question.

like image 72
JD Long Avatar answered Oct 21 '22 06:10

JD Long


With some liberal use of MAGIC

pd.concat([
        df.assign(
            **{x: 'Total' for x in 'abc'[i:]}
        ).groupby(list('abc')).sum() for i in range(4)
    ]).sort_index()

                     Sce1  Sce2  Sce3  Sce4  Sce5  Sc6
a      b      c                                       
Animal Air    Eagle   1.0   0.1   0.1   0.6   0.9  0.1
              Owl     0.3   0.1   0.5   0.3   0.5  0.9
              Total   1.3   0.2   0.6   0.9   1.4  1.0
       Ground Cat     0.6   0.5   0.3   0.5   1.0  0.2
              Dog     0.0   0.9   0.5   0.0   0.3  0.4
              Total   0.6   1.4   0.8   0.5   1.3  0.6
       Total  Total   1.9   1.6   1.4   1.4   2.7  1.6
Object Metal  Bike    0.5   0.1   0.4   0.7   0.4  0.2
              Car     0.3   0.3   0.8   0.6   0.5  0.6
              Total   0.8   0.4   1.2   1.3   0.9  0.8
       Total  Total   2.6   1.6   1.9   2.3   2.0  2.3
       Wood   Chair   0.9   0.6   0.1   0.9   0.2  0.8
              Table   0.9   0.6   0.6   0.1   0.9  0.7
              Total   1.8   1.2   0.7   1.0   1.1  1.5
Total  Total  Total   4.5   3.2   3.3   3.7   4.7  3.9

I can get exactly what you asked for with

pd.concat([
        df.assign(
            **{x: '' for x in 'abc'[i:]}
        ).groupby(list('abc')).sum() for i in range(1, 4)
    ]).sort_index()

                     Sce1  Sce2  Sce3  Sce4  Sce5  Sc6
a      b      c                                       
Animal                1.9   1.6   1.4   1.4   2.7  1.6
       Air            1.3   0.2   0.6   0.9   1.4  1.0
              Eagle   1.0   0.1   0.1   0.6   0.9  0.1
              Owl     0.3   0.1   0.5   0.3   0.5  0.9
       Ground         0.6   1.4   0.8   0.5   1.3  0.6
              Cat     0.6   0.5   0.3   0.5   1.0  0.2
              Dog     0.0   0.9   0.5   0.0   0.3  0.4
Object                2.6   1.6   1.9   2.3   2.0  2.3
       Metal          0.8   0.4   1.2   1.3   0.9  0.8
              Bike    0.5   0.1   0.4   0.7   0.4  0.2
              Car     0.3   0.3   0.8   0.6   0.5  0.6
       Wood           1.8   1.2   0.7   1.0   1.1  1.5
              Chair   0.9   0.6   0.1   0.9   0.2  0.8
              Table   0.9   0.6   0.6   0.1   0.9  0.7

As for the how! I'll leave that as an exercise for the reader.

like image 26
piRSquared Avatar answered Oct 21 '22 06:10

piRSquared