Pandas Groupby apply function to count values greater than zero

I am using groupby and agg in the following manner:

df.groupby('group')['a'].agg({'mean' : np.mean, 'std' : np.std})

and I would also like to count the values greater than zero in the same column ['a'].

The following line does the count as I want:

sum(x > 0 for x in df['a'])

but I can't get it to work when applying it to the groupby.

Following an example of applying a pandas calculation to a groupby, I tried:

df.groupby('group')['a'].apply(sum(x > 0 for x in df['a']))

but I get an error message: AttributeError: 'numpy.int32' object has no attribute '__module__'

Can anybody please suggest how this might be done?

asked Mar 30 '14 23:03 by rdh9


1 Answer

Answer from the comments:

 .agg({'pos':lambda ts: (ts > 0).sum()}) # –  behzad.nouri Mar 31 at 0:00 

This is my contribution to the backlog of unanswered questions :) Credits to behzad.nouri

Update 2020: recent pandas versions no longer support renaming with a dict inside agg, so use named aggregation (keyword arguments) instead:

 .agg(pos=lambda ts: (ts > 0).sum()) 

otherwise it will result in the following error:

SpecificationError: nested renamer is not supported
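
For completeness, here is a minimal sketch of how the whole thing might look, combining the mean and std from the question with the positive-value count in one agg call. The sample data is made up for illustration; only the column names 'group' and 'a' come from the question, and it assumes a pandas version that supports named aggregation.

```python
import pandas as pd

# Hypothetical sample data; only the column names 'group' and 'a'
# are taken from the question.
df = pd.DataFrame({
    "group": ["x", "x", "x", "y", "y"],
    "a": [1.0, -2.0, 3.0, 0.0, 4.0],
})

# Named aggregation: each keyword becomes an output column.
# (ts > 0) is a boolean Series, and summing it counts the True values.
result = df.groupby("group")["a"].agg(
    mean="mean",
    std="std",
    pos=lambda ts: (ts > 0).sum(),
)
print(result)
```

With this data, group x has two values above zero and group y has one (the 0.0 is not counted). The original attempt failed because apply was handed the already-computed number from sum(...) rather than a function to call on each group.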
answered Sep 23 '22 08:09 by Reblochon Masque