I'm struggling to figure out how to combine two different syntaxes for pandas' DataFrame.groupby().agg()
function. Take this simple data frame:
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['group1', 'group1', 'group2', 'group2', 'group3', 'group3'],
                   'B': [10, 12, 10, 25, 10, 12],
                   'C': [100, 102, 100, 250, 100, 102]})
>>> df
        A   B    C
0  group1  10  100
1  group1  12  102
2  group2  10  100
3  group2  25  250
4  group3  10  100
5  group3  12  102
I know you can send two functions to agg() and get a new data frame where each function is applied to each column:
df.groupby('A').agg([np.mean, np.std])
           B                C
        mean        std  mean         std
A
group1  11.0   1.414214   101    1.414214
group2  17.5  10.606602   175  106.066017
group3  11.0   1.414214   101    1.414214
And I know you can pass arguments to a single function:
df.groupby('A').agg(np.std, ddof=0)
          B   C
A
group1  1.0   1
group2  7.5  75
group3  1.0   1
But is there a way to pass multiple functions along with arguments for one or both of them? I was hoping to find something like df.groupby('A').agg([np.mean, (np.std, ddof=0)])
in the docs, but so far no luck. Any ideas?
The docs on aggregate are in fact a bit lacking. There may be a way to handle this by passing the arguments directly, and you could look into the pandas source code for that (perhaps I will later).
However, you could easily do:
df.groupby('A').agg([np.mean, lambda x: np.std(x, ddof=0)])
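This works just as well, because the lambda fixes ddof=0 before agg() ever sees the function. The only visible difference is that the resulting column is labeled after the lambda (typically <lambda>) instead of std.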
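If the <lambda> column label bothers you, a small named wrapper does the same job and gives the column a readable name. The following is a minimal sketch rather than an official pandas recipe: std_pop is a hypothetical helper name introduced here, and the keyword form in the second call assumes a pandas version that supports named aggregation (0.25 or later).

```python
import pandas as pd

df = pd.DataFrame({'A': ['group1', 'group1', 'group2', 'group2', 'group3', 'group3'],
                   'B': [10, 12, 10, 25, 10, 12],
                   'C': [100, 102, 100, 250, 100, 102]})


def std_pop(x):
    """Population standard deviation (ddof=0); hypothetical helper name."""
    return x.std(ddof=0)


# List form: each function is applied to every column, and the wrapper's
# name ("std_pop") becomes the second-level column label.
result = df.groupby('A').agg(['mean', std_pop])
print(result)

# Named aggregation (assumes pandas >= 0.25): choose the output column
# names and the source column/function pairs explicitly.
named = df.groupby('A').agg(
    B_mean=('B', 'mean'),
    B_std0=('B', std_pop),
    C_std0=('C', std_pop),
)
print(named)
```

Either way, the extra argument is bound inside the wrapper, so agg() only receives plain one-argument callables.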