python pandas groupby() result

I have the following Python pandas DataFrame:

df = pd.DataFrame({
    'A': [1,1,1,1,2,2,2,3,3,4,4,4],
    'B': [5,5,6,7,5,6,6,7,7,6,7,7],
    'C': [1,1,1,1,1,1,1,1,1,1,1,1]
})

df
    A  B  C
0   1  5  1
1   1  5  1
2   1  6  1
3   1  7  1
4   2  5  1
5   2  6  1
6   2  6  1
7   3  7  1
8   3  7  1
9   4  6  1
10  4  7  1
11  4  7  1

I would like to add another column holding the sum of the C values for each fixed pair of A and B values. That is, something like:

    A  B  C  D
0   1  5  1  2
1   1  5  1  2
2   1  6  1  1
3   1  7  1  1
4   2  5  1  1
5   2  6  1  2
6   2  6  1  2
7   3  7  1  2
8   3  7  1  2
9   4  6  1  1
10  4  7  1  2
11  4  7  1  2

I have tried pandas groupby and it kind of works:

res = {}
for a, group_by_A in df.groupby('A'):
    group_by_B = group_by_A.groupby('B', as_index=False)
    res[a] = group_by_B['C'].sum()

but I don't know how to get the results from res back into df in an orderly fashion. I would be very happy with any advice on this. Thank you.

asked Jul 16 '13 00:07

Simon Righley

1 Answer

Here's one way (it feels like this should work in one go with an apply, but I can't get that to work).

In [11]: g = df.groupby(['A', 'B'])

In [12]: df1 = df.set_index(['A', 'B'])

The groupby size method is the one you want (since every C value is 1 here, the group size equals the sum of C). We have to match it to 'A' and 'B' as the index:

In [13]: df1['D'] = g.size()  # unfortunately this doesn't play nice with as_index=False
# Same would work with g['C'].sum()

In [14]: df1.reset_index()
Out[14]:
    A  B  C  D
0   1  5  1  2
1   1  5  1  2
2   1  6  1  1
3   1  7  1  1
4   2  5  1  1
5   2  6  1  2
6   2  6  1  2
7   3  7  1  2
8   3  7  1  2
9   4  6  1  1
10  4  7  1  2
11  4  7  1  2
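
For comparison, a one-step variant is possible with groupby transform, which broadcasts each group's aggregate back onto the original rows so no index juggling is needed. This is a minimal sketch rather than the approach used above; it assumes a pandas version in which transform accepts the aggregation name 'sum' as a string, and it rebuilds the question's frame so that it runs on its own.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'A': [1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4],
    'B': [5, 5, 6, 7, 5, 6, 6, 7, 7, 6, 7, 7],
    'C': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
})

# transform returns a Series aligned with df's original index,
# holding the per-(A, B) sum of C repeated on every row of the group.
df['D'] = df.groupby(['A', 'B'])['C'].transform('sum')

print(df)
```

Because the result keeps the original row order and index, it can be assigned directly as a new column, and it sums the actual C values rather than counting rows, so it still holds when C is not all ones.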

answered Dec 19 '22 14:12

Andy Hayden
