How to apply different aggregation functions to the same column using pandas groupby

Tags: python, pandas

It is clear that when doing

 data.groupby(['A','B']).mean()

we get a result with a MultiIndex on levels 'A' and 'B' and one column holding the mean of each group.

How could I get count() and std() at the same time?

The result should look like this in a DataFrame:

A   B    mean   count   std


asked Jun 05 '15 19:06

Hello lad

1 Answer

The following should work:

data.groupby(['A','B']).agg([pd.Series.mean, pd.Series.std, pd.Series.count])

Calling agg with a list of functions generates one output column per function, each applied to every group.

Example:

In [12]:

import numpy as np
import pandas as pd
df = pd.DataFrame({'a':np.random.randn(5), 'b':[0,0,1,1,2]})
df.groupby(['b']).agg([pd.Series.mean, pd.Series.std, pd.Series.count])
Out[12]:
          a                
       mean       std count
b                          
0 -0.769198  0.158049     2
1  0.247708  0.743606     2
2 -0.312705       NaN     1
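
The NaN in the std column for group 2 is expected rather than a bug: that group has a single row, and pandas computes the sample standard deviation (ddof=1) by default, which is undefined for one observation. A minimal sketch with made-up fixed values (not the random data above) that illustrates this, and shows that asking for the population standard deviation (ddof=0) gives 0.0 instead:

```python
import pandas as pd

# Hypothetical fixed data so the result is reproducible; group 2 has one row.
df = pd.DataFrame({'a': [1.0, 3.0, 2.0, 6.0, 5.0], 'b': [0, 0, 1, 1, 2]})

# Default std uses ddof=1 (sample std), which is NaN for a single observation.
sample_std = df.groupby('b')['a'].std()

# Population std (ddof=0) is defined for a single observation and equals 0.0.
population_std = df.groupby('b')['a'].std(ddof=0)

print(sample_std)
print(population_std)
```

Whether ddof=0 is appropriate depends on what the statistic is meant to describe; the default is usually what you want for sampled data.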

You can also pass the method names as strings. The common ones work; some of the more obscure ones don't (I can't remember which), but in this case they work fine. Thanks to @ajcr for the suggestion:

In [16]:
df = pd.DataFrame({'a':np.random.randn(5), 'b':[0,0,1,1,2]})
df.groupby(['b']).agg(['mean', 'std', 'count'])

Out[16]:
          a                
       mean       std count
b                          
0 -1.037301  0.790498     2
1 -0.495549  0.748858     2
2 -0.644818       NaN     1
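
To get the flat layout the question asks for (A, B, mean, count, std as ordinary columns), one option is to select the column before aggregating and then call reset_index. This is only a sketch: the data is made up, and the column name 'value' is an assumption, since the question does not name the column being aggregated.

```python
import pandas as pd

# Hypothetical data: group keys 'A' and 'B', plus a numeric column 'value'
# (the name 'value' is an assumption; the question does not name it).
data = pd.DataFrame({
    'A': ['x', 'x', 'x', 'y', 'y'],
    'B': [1, 1, 2, 1, 1],
    'value': [1.0, 3.0, 5.0, 2.0, 6.0],
})

# Selecting a single column first gives one level of column labels
# ('mean', 'count', 'std'); reset_index then turns the group keys
# 'A' and 'B' back into regular columns.
flat = (
    data.groupby(['A', 'B'])['value']
        .agg(['mean', 'count', 'std'])
        .reset_index()
)

print(flat.columns.tolist())
print(flat)
```

In pandas 0.25 and later, named aggregation, e.g. agg(mean=('value', 'mean'), count=('value', 'count'), std=('value', 'std')), should give the same flat column labels without selecting the column first.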


answered Oct 15 '22 04:10

EdChum
