
Python Pandas: Passing Multiple Functions to agg() with Arguments

Tags: python, pandas

I'm struggling to figure out how to combine two different syntaxes for pandas' groupby agg() function. Take this simple data frame:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['group1', 'group1', 'group2', 'group2', 'group3', 'group3'],
                   'B': [10, 12, 10, 25, 10, 12],
                   'C': [100, 102, 100, 250, 100, 102]})

>>> df
[output]
        A   B    C
0  group1  10  100
1  group1  12  102
2  group2  10  100
3  group2  25  250
4  group3  10  100
5  group3  12  102

I know you can send two functions to agg() and get a new data frame where each function is applied to each column:

df.groupby('A').agg([np.mean, np.std])

[output]
           B                C            
        mean        std  mean         std
A                                        
group1  11.0   1.414214   101    1.414214
group2  17.5  10.606602   175  106.066017
group3  11.0   1.414214   101    1.414214

And I know you can pass arguments to a single function:

df.groupby('A').agg(np.std, ddof=0)

[output]
          B   C
A              
group1  1.0   1
group2  7.5  75
group3  1.0   1

But is there a way to pass multiple functions along with arguments for one or both of them? I was hoping to find something like df.groupby('A').agg([np.mean, (np.std, ddof=0)]) in the docs, but so far no luck. Any ideas?

BringMyCakeBack asked Oct 14 '14 06:10


1 Answer

The docs on aggregate are indeed a bit lacking. There might be a way to do this by passing the arguments through agg() itself, and you could look into the pandas source code to check (perhaps I will later).

However, you can simply wrap the function that needs arguments in a lambda:

df.groupby('A').agg([np.mean, lambda x: np.std(x, ddof=0)])

And it would work just as well.
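One side effect of the lambda is that the resulting column is labelled after the lambda rather than after the statistic. A minimal sketch of a variation, assuming you are happy to define a small named wrapper (std_pop is a made-up name, not part of pandas or NumPy): agg() uses the function's name as the column label, and the string 'mean' stands in for np.mean.

```python
import pandas as pd

df = pd.DataFrame({'A': ['group1', 'group1', 'group2', 'group2', 'group3', 'group3'],
                   'B': [10, 12, 10, 25, 10, 12],
                   'C': [100, 102, 100, 250, 100, 102]})


def std_pop(x):
    """Population standard deviation (ddof=0). Hypothetical helper name."""
    # x is the Series holding one column of one group
    return x.std(ddof=0)


# Each entry in the list is applied to every non-grouping column;
# the wrapper carries the ddof argument that agg() cannot take per function.
result = df.groupby('A').agg(['mean', std_pop])
print(result)
```

The B and C columns should each get a mean and a std_pop sub-column, with the std_pop values matching the ddof=0 output shown in the question.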

Korem answered Sep 29 '22 22:09