 

pandas groupby and boolean selection

I often end up doing things like this in pandas:

s2 = s1.groupby(level=1).sum()
s2 = s2[s2>25]

In words, I do some groupby operation and then want to keep only results that meet some condition for the result.

Is there a way to do this in one line? More specifically, is it possible to do this without first creating the series and then doing the Boolean selection in a second step?

itzy asked Oct 05 '17 16:10





1 Answer

Assuming s1 is a pandas.Series

  1. You can pass level to pd.Series.sum
  2. pd.Series.compress is handy

s1.sum(level=1).compress(lambda s: s.gt(25))
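
As far as I know, recent pandas releases no longer provide pd.Series.compress or the level argument to sum, so the line above may fail there. A minimal sketch of the same one-line idea that should work on current versions is to chain groupby with a callable passed to .loc. The sample data and the index level names 'outer' and 'inner' are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data: a Series with a two-level MultiIndex.
index = pd.MultiIndex.from_tuples(
    [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")],
    names=["outer", "inner"],
)
s1 = pd.Series([10, 5, 20, 3], index=index)

# Sum over the second index level, then filter in the same chain.
# .loc accepts a callable that receives the intermediate Series,
# so no temporary variable is needed for the Boolean selection.
s2 = s1.groupby(level=1).sum().loc[lambda s: s > 25]
print(s2)
```

The callable receives the grouped result, so the condition is evaluated on the sums and not on the original values.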

Assuming s1 is a pandas.DataFrame
and that there is a column named 'col'

s1.sum(level=1).query('col > 25')
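
For the DataFrame case, a similar sketch on a current pandas version would use groupby(level=1).sum() in place of sum(level=1) and keep the query call. The frame below, including the extra column 'other', is invented purely to make the example runnable:

```python
import pandas as pd

# Hypothetical sample data: a DataFrame with a two-level MultiIndex.
index = pd.MultiIndex.from_tuples(
    [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")],
    names=["outer", "inner"],
)
df = pd.DataFrame({"col": [10, 5, 20, 3], "other": [1, 2, 3, 4]}, index=index)

# Sum over the second index level, then keep only the rows
# whose summed 'col' value exceeds 25, all in one expression.
result = df.groupby(level=1).sum().query("col > 25")
print(result)
```

query evaluates the expression against the aggregated frame, so all columns of the matching groups are kept.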
piRSquared answered Sep 28 '22 15:09
