I've hit a corner case in pandas. I am trying to use the agg function without doing a groupby first. Say I want an aggregation over the entire DataFrame, i.e.
import numpy as np
from pandas import DataFrame
DF = DataFrame(np.random.randn(5, 3), index=list("ABCDE"), columns=list("abc"))
DF.groupby([]).agg({'a': np.sum, 'b': np.mean})  # <--- does not work
And DF.agg( {'a' ... } ) does not work either.
My workaround is to set DF['Total'] = 'Total' and then call DF.groupby(['Total']), but this seems a bit artificial.
Has anyone got a cleaner solution?
It's not great either, but for this case you can pass groupby a function that always returns True; at least that doesn't require changing df:
>>> import numpy as np
>>> from pandas import DataFrame
>>> df = DataFrame(np.random.randn(5, 3), index=list("ABCDE"), columns=list("abc"))
>>> df.groupby(lambda x: True).agg({'a' : np.sum, 'b' : np.mean } )
             a         b
True  1.836649 -0.692655
>>>
>>> df['total'] = 'total'
>>> df.groupby(['total']).agg({'a' : np.sum, 'b' : np.mean } )
              a         b
total
total  1.836649 -0.692655
You could use various builtins instead of lambda x: True, but they're less explicit and only work accidentally.
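For what it's worth, newer pandas releases (0.20 and later, as far as I know) let you call agg directly on the DataFrame with a dict, so the dummy group key may not be needed at all. Below is a minimal sketch under that assumption; the seeded generator and the names totals and grouped are only for illustration and are not from the original post.

```python
import numpy as np
import pandas as pd

# Seeded random data so the example is repeatable (the seed is arbitrary).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((5, 3)),
                  index=list("ABCDE"), columns=list("abc"))

# Whole-frame aggregation without groupby.
# With one function per column this returns a Series indexed by column name.
totals = df.agg({"a": "sum", "b": "mean"})

# The constant-key groupby from the answer above, for comparison.
# Every row lands in the single group keyed by True.
grouped = df.groupby(lambda label: True).agg({"a": "sum", "b": "mean"})

print(totals)
print(grouped)
```

Both approaches should produce the same two numbers; the difference is the shape, since df.agg gives back a Series while the groupby version gives a one-row DataFrame.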