
pandas group by ALL functionality?

I'm using the pandas groupby + agg functionality to generate nice reports:

aggs_dict = {'a':['mean', 'std'], 'b': 'size'}
df.groupby('year').agg(aggs_dict)

I would like to apply the same aggs_dict to the entire dataframe as a single group, without splitting by year, something like:

df.groupall().agg(aggs_dict)

or:

df.agg(aggs_dict)

But I couldn't find an elegant way to do it. Note that in my real code aggs_dict is quite complex, so it's rather cumbersome to write:

df.a.mean()
df.a.std()
df.b.size
....

Am I missing something simple and nice?

ihadanny asked Sep 07 '16 08:09


3 Answers

You could also group on a function that returns the same key for every row:

df.groupby(lambda x: True).agg(aggs_dict)
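
As a minimal sketch of how this behaves, the example below runs the lambda approach on a small made-up dataframe. The sample values are assumptions for illustration only; the column names and aggs_dict follow the question.

```python
import pandas as pd

# Hypothetical sample data, shaped like the dataframe in the question
df = pd.DataFrame({
    "year": [2015, 2015, 2016, 2016],
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [10, 20, 30, 40],
})
aggs_dict = {"a": ["mean", "std"], "b": "size"}

# groupby calls the lambda once per index label; it always returns True,
# so every row lands in one group whose key is True
overall = df.groupby(lambda x: True).agg(aggs_dict)
print(overall)
```

The result is a one-row dataframe with the same column layout that df.groupby('year').agg(aggs_dict) produces, which makes it easy to concatenate with the per-year report.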

Hervé Mignot answered Nov 17 '22 15:11


Ami Tavory's answer is a great way to do it, but if you want a solution that doesn't require creating a new column and deleting it afterwards, you could pass a constant list as the grouping key:

df.groupby([True]*len(df)).agg(aggs_dict)
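
A short sketch of the constant-key approach, again on made-up sample data (the values are assumptions; only the column names and aggs_dict come from the question). It also drops the leftover True label from the index, which is optional.

```python
import pandas as pd

# Hypothetical sample data, shaped like the dataframe in the question
df = pd.DataFrame({
    "year": [2015, 2015, 2016, 2016],
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [10, 20, 30, 40],
})
aggs_dict = {"a": ["mean", "std"], "b": "size"}

# One key per row, all identical, so groupby forms a single group
key = [True] * len(df)
overall = df.groupby(key).agg(aggs_dict)

# The index label True carries no information, so replace it with 0
overall = overall.reset_index(drop=True)
print(overall)
```

The aggregated values should match what the individual calls (df.a.mean(), df.a.std(), len(df.b)) return on the whole dataframe.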

bunji answered Nov 17 '22 13:11


You could add a dummy column:

df['dummy'] = 1

Then groupby + agg on it:

df.groupby('dummy').agg(aggs_dict)

and then delete it when you're done.
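
If mutating df is a concern, one possible variation (not part of the approach above, just a sketch) is to add the dummy column with DataFrame.assign, which returns a copy and leaves the original untouched, so there is nothing to delete afterwards. The sample data below is made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, shaped like the dataframe in the question
df = pd.DataFrame({
    "year": [2015, 2015, 2016, 2016],
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [10, 20, 30, 40],
})
aggs_dict = {"a": ["mean", "std"], "b": "size"}

# assign() returns a new dataframe with the extra column;
# df itself never gains a 'dummy' column
overall = df.assign(dummy=1).groupby("dummy").agg(aggs_dict)
print(overall)
```

The trade-off is a temporary copy of the dataframe, which may matter for very large data.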

Ami Tavory answered Nov 17 '22 15:11