I have a dataframe that has months for columns, and various departments for rows.
      2013April  2013May  2013June
Dep1          0       10        15
Dep2         10       15        20
I'm looking to add a column that counts the number of months that have a value greater than 0. For example:
      2013April  2013May  2013June  Count>0
Dep1          0       10        15        2
Dep2         10       15        20        3
The number of columns this function needs to span is variable. I think defining a function and then using .apply is the solution, but I can't seem to figure it out.
Use sum() to count specific values in a DataFrame column. Comparing a column with == produces a boolean Series that is True wherever the value matches, and calling sum() on that Series counts the matches.
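As a minimal sketch of the sum() approach described above: the DataFrame and its column names (team, points) are made up for illustration and do not come from the question.

```python
import pandas as pd

# Hypothetical data, used only to illustrate the technique
df = pd.DataFrame({"team": ["A", "B", "A", "C"], "points": [10, 0, 15, 0]})

# The comparison yields a boolean Series; sum() counts the True entries
n_team_a = int((df["team"] == "A").sum())
n_zero_points = int((df["points"] == 0).sum())

print(n_team_a, n_zero_points)
```

The same pattern works with any comparison operator, such as > 0.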
Use count() by column name. Use pandas DataFrame.groupby() to group the rows by a column, then call count() to get the number of values in each group, ignoring None and NaN values. It works with non-floating-point data as well.
A groupby can also count conditionally: group the rows of the DataFrame by var1, then count the number of rows in each group where var2 is equal to 'val'.
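The two groupby counts described above could look something like the sketch below. The column names var1 and var2 and the value 'val' follow the text, but the data is invented, and this is only one of several ways to write the conditional count.

```python
import pandas as pd

# Hypothetical data; var1 is the grouping column, var2 holds the values to test
df = pd.DataFrame({
    "var1": ["x", "x", "y", "y", "y"],
    "var2": ["val", "other", "val", "val", None],
})

# count() per group ignores missing values (None/NaN)
non_null_counts = df.groupby("var1")["var2"].count()

# Conditional count: build a boolean mask, then sum it within each group
val_counts = df["var2"].eq("val").groupby(df["var1"]).sum()

print(non_null_counts.to_dict())
print(val_counts.to_dict())
```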
First, pick the columns you want to count over and store them in cols. Then:
df[cols].apply(lambda s: (s > 0).sum(), axis=1)
This takes advantage of the fact that True and False are 1 and 0, respectively, in Python.
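To see the boolean-as-integer behaviour on its own, here is a small sketch using the Dep1 row from the question as a standalone Series (the variable names are illustrative):

```python
import pandas as pd

# Dep1's values from the question
row = pd.Series([0, 10, 15], index=["2013April", "2013May", "2013June"])

mask = row > 0           # boolean Series: False, True, True
count = int(mask.sum())  # each True contributes 1, each False contributes 0

print(mask.tolist(), count)
```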
However, I'd use this instead:
(df[cols] > 0).sum(axis=1)
because it takes advantage of NumPy vectorization instead of calling a Python function once per row.
%timeit df.apply(lambda s: (s > 0).sum(), axis=1)
10 loops, best of 3: 141 ms per loop
%timeit (df > 0).sum(1)
1000 loops, best of 3: 319 µs per loop
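Putting it together for the example in the question, a minimal end-to-end sketch might look like this. It assumes every existing column is a month column; if the frame has other columns, cols would need to be chosen more selectively.

```python
import pandas as pd

# Rebuild the example frame from the question
df = pd.DataFrame(
    {"2013April": [0, 10], "2013May": [10, 15], "2013June": [15, 20]},
    index=["Dep1", "Dep2"],
)

# Capture the month columns before adding the new column,
# so the count never includes itself; works for any number of months
cols = list(df.columns)

# Vectorized row-wise count of values greater than 0
df["Count>0"] = (df[cols] > 0).sum(axis=1)

print(df)
```

Because cols is taken from the frame itself, the same code works no matter how many month columns there are.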