I'm working with pandas in Python, and I have a pandas Series object with a three-level index that I can't for the life of me figure out. It essentially looks like this:
>>> print(series_object)
key1          key2   key3
First class   19438  Error1:0      117
              16431  Error2:0       80
              1      Error3:0       70
Second class  28039  Error4:0       65
Third class   2063   Error5:0       28
              19439  Error6:0       25
Fourth class  25975  Error7:0       11
Fifth class   23111  Error8:0        7
              1243   Error9:665      4
                     Error9:581      3
              27525  Error10:0       3
              1243   Error9:748      2
              1247   Error11:65      2
              1243   Error9:852      2
              1247   Error11:66      2
                     Error11:70      1
                     Error11:95      1
                     Error11:181     1
                     Error11:102     1
                     Error11:160     1
I want a way to sum the values of this object wherever key2 matches, so that series_object becomes:
>>> print(series_object)
key1          key2   key3
First class   19438  Error1:0      117
              16431  Error2:0       80
              1      Error3:0       70
Second class  28039  Error4:0       65
Third class   2063   Error5:0       28
              19439  Error6:0       25
Fourth class  25975  Error7:0       11
Fifth class   23111  Error8:0        7
              1243   Error9:665     11
              27525  Error10:0       3
              1247   Error11:65      9
I've tried a lot of different things. With a normal array this wouldn't be an issue for me, but the pandas Series object is new to me and I find it confusing. Could anyone provide some help?
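You can use groupby on the index levels:
http://pandas.pydata.org/pandas-docs/stable/groupby.html#groupby-with-multiindex
In your case, to sum all values that share the same key2:
series_object.groupby(level='key2').sum()
Or, if you want to keep the 'key1' information as well:
series_object.groupby(level=['key1', 'key2']).sum()
Either call returns a new Series indexed only by the levels you grouped on (so key3 is dropped); assign the result back to series_object if you want to replace the original.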
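For anyone who wants to try this out, here is a minimal sketch on a small subset of the rows from the question. The way the index is built with MultiIndex.from_tuples is an assumption, since the question does not show how series_object was created, and so is the reading that the 1243 and 1247 rows all belong to 'Fifth class'. Passing sort=False is optional; it is used here so the groups keep the order in which they first appear rather than being sorted by key.

```python
import pandas as pd

# Assumed reconstruction of a few rows from the question's Series;
# the real object may have been built differently.
index = pd.MultiIndex.from_tuples(
    [
        ("Fifth class", 23111, "Error8:0"),
        ("Fifth class", 1243, "Error9:665"),
        ("Fifth class", 1243, "Error9:581"),
        ("Fifth class", 27525, "Error10:0"),
        ("Fifth class", 1243, "Error9:748"),
        ("Fifth class", 1247, "Error11:65"),
    ],
    names=["key1", "key2", "key3"],
)
series_object = pd.Series([7, 4, 3, 3, 2, 2], index=index)

# Sum every value that shares the same key2; the result is indexed by key2 only.
by_key2 = series_object.groupby(level="key2", sort=False).sum()

# Same sum, but grouped on (key1, key2) so the key1 label is kept in the index.
by_key1_key2 = series_object.groupby(level=["key1", "key2"], sort=False).sum()

print(by_key2)
print(by_key1_key2)
```

In this subset the three 1243 rows (4, 3 and 2) collapse into a single entry. Note that the key3 label is not carried over in either result, so the exact output layout asked for in the question (which keeps one key3 label per group) would need an extra step.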