
Sum the values of a series in pandas based on one of multiple keys?

Tags: python, pandas

I'm working with pandas in Python, and I have a pandas Series object with a three-level index that I can't for the life of me figure out. It essentially looks like this:

>>> print(series_object)

key1              key2      key3                                                             
First class       19438     Error1:0       117
                  16431     Error2:0       80
                  1         Error3:0       70
Second class      28039     Error4:0       65
Third class       2063      Error5:0       28
                  19439     Error6:0       25
Fourth class      25975     Error7:0       11
Fifth class       23111     Error8:0       7
                  1243      Error9:665     4
                            Error9:581     3
                  27525     Error10:0      3
                  1243      Error9:748     2
                  1247      Error11:65     2
                  1243      Error9:852     2
                  1247      Error11:66     2
                            Error11:70     1
                            Error11:95     1
                            Error11:181    1
                            Error11:102    1
                            Error11:160    1

I want to sum the values of this object wherever key2 matches, so that series_object becomes:

>>> print(series_object)
key1              key2      key3                                                             
First class       19438     Error1:0       117
                  16431     Error2:0       80
                  1         Error3:0       70
Second class      28039     Error4:0       65
Third class       2063      Error5:0       28
                  19439     Error6:0       25
Fourth class      25975     Error7:0       11
Fifth class       23111     Error8:0       7
                  1243      Error9:665     11
                  27525     Error10:0      3
                  1247      Error11:65     9

I've tried a lot of different things. With a normal array this wouldn't be an issue for me, but the pandas Series object is new to me and I find it confusing. Could anyone provide some help?

Asked by echolocation, Mar 18 '23 09:03


1 Answer

You can use groupby.

http://pandas.pydata.org/pandas-docs/stable/groupby.html#groupby-with-multiindex

In your case:

series_object.groupby(level='key2').sum()

Or, if you want to keep the 'key1' information as well:

series_object.groupby(level=['key1', 'key2']).sum()
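
For anyone who wants to try this, here is a minimal sketch that rebuilds the Series from the question and applies both calls. The `rows` list and the way the index is constructed are my own assumptions about how the data is laid out (blank cells in the printout are read as "same as the row above"). The `sort=False` argument is an optional addition that is not in the calls above; it asks pandas to keep groups in the order they first appear rather than sorting the keys.

```python
import pandas as pd

# Data copied from the question's printout; blank index cells there are
# assumed to repeat the value from the row above.
rows = [
    ("First class", 19438, "Error1:0", 117),
    ("First class", 16431, "Error2:0", 80),
    ("First class", 1, "Error3:0", 70),
    ("Second class", 28039, "Error4:0", 65),
    ("Third class", 2063, "Error5:0", 28),
    ("Third class", 19439, "Error6:0", 25),
    ("Fourth class", 25975, "Error7:0", 11),
    ("Fifth class", 23111, "Error8:0", 7),
    ("Fifth class", 1243, "Error9:665", 4),
    ("Fifth class", 1243, "Error9:581", 3),
    ("Fifth class", 27525, "Error10:0", 3),
    ("Fifth class", 1243, "Error9:748", 2),
    ("Fifth class", 1247, "Error11:65", 2),
    ("Fifth class", 1243, "Error9:852", 2),
    ("Fifth class", 1247, "Error11:66", 2),
    ("Fifth class", 1247, "Error11:70", 1),
    ("Fifth class", 1247, "Error11:95", 1),
    ("Fifth class", 1247, "Error11:181", 1),
    ("Fifth class", 1247, "Error11:102", 1),
    ("Fifth class", 1247, "Error11:160", 1),
]

# Build the three-level MultiIndex (key1, key2, key3) and the Series on it.
index = pd.MultiIndex.from_tuples(
    [row[:3] for row in rows], names=["key1", "key2", "key3"]
)
series_object = pd.Series([row[3] for row in rows], index=index)

# Sum all rows that share the same key2.
by_key2 = series_object.groupby(level="key2").sum()

# Sum all rows that share the same (key1, key2) pair.
# sort=False requests first-appearance order instead of sorted keys.
by_key1_key2 = series_object.groupby(level=["key1", "key2"], sort=False).sum()

print(by_key1_key2)
```

Note that the result is indexed only by the levels you grouped on, so 'key3' is dropped rather than kept as in the desired output in the question. Assign the result back (`series_object = by_key1_key2`) if you want to replace the original.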
Answered by Alex, Apr 07 '23 08:04