Summing over a multiindex level in a pandas series

Using the pandas package in Python, I would like to sum (marginalize) over one level of a Series with a 3-level MultiIndex to produce a Series with a 2-level MultiIndex. For example, if I have the following:

    import pandas as pd

    ind = [tuple(x) for x in ['ABC', 'ABc', 'AbC', 'Abc', 'aBC', 'aBc', 'abC', 'abc']]
    mi = pd.MultiIndex.from_tuples(ind)
    data = pd.Series([264, 13, 29, 8, 152, 7, 15, 1], index=mi)

    A  B  C    264
          c     13
       b  C     29
          c      8
    a  B  C    152
          c      7
       b  C     15
          c      1

I would like to sum over the third level (the C/c variable) to produce the following output:

    A  B    277
       b     37
    a  B    159
       b     16

What is the best way in Pandas to do this?

dylkot asked Jul 18 '14 13:07


1 Answer

If you know you always want to group by the first two levels and sum out the remaining one, then this is pretty easy:

    In [27]: data.groupby(level=[0, 1]).sum()
    Out[27]:
    A  B    277
       b     37
    a  B    159
       b     16
    dtype: int64
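If the level to sum out is not always the last one, the same groupby idea can be generalized by grouping on every level except the one being marginalized. The following is only a sketch: the level names `first`, `second`, `third` and the helper `sum_out_level` are made up for illustration and are not part of the question or of pandas.

```python
import pandas as pd

ind = [tuple(x) for x in ['ABC', 'ABc', 'AbC', 'Abc', 'aBC', 'aBc', 'abC', 'abc']]
# Assumption: the levels are given names so they can be referred to by label.
mi = pd.MultiIndex.from_tuples(ind, names=['first', 'second', 'third'])
data = pd.Series([264, 13, 29, 8, 152, 7, 15, 1], index=mi)


def sum_out_level(series, level):
    """Hypothetical helper: sum over one named index level, keeping the others."""
    keep = [name for name in series.index.names if name != level]
    return series.groupby(level=keep).sum()


# Same result as data.groupby(level=[0, 1]).sum()
result = sum_out_level(data, 'third')

# Summing out the middle level instead
result_second = sum_out_level(data, 'second')
```

Grouping by level names rather than positions keeps the call readable if the index is later reordered, at the cost of requiring the levels to be named.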
chrisaycock answered Sep 23 '22 06:09