I have a dataframe, e.g:
Date B C
20.07.2018 10 8
20.07.2018 1 0
21.07.2018 0 1
21.07.2018 1 0
How can I count the zero and non-zero values in each column for each date?
Using .sum()
doesn't help, because it adds up the values themselves rather than counting them.
E.g., expected output for the zero values:
Date B C
20.07.2018 0 1
21.07.2018 1 1
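For reproducibility, the frame above can be rebuilt like this (column names and values taken straight from the tables in the question); it also shows why a plain .sum() is not what is wanted:

```python
import pandas as pd

# Rebuild the sample frame from the question
df = pd.DataFrame({
    'Date': ['20.07.2018', '20.07.2018', '21.07.2018', '21.07.2018'],
    'B': [10, 1, 0, 1],
    'C': [8, 0, 1, 0],
})

# A plain .sum() aggregates the values themselves, not their counts:
# 20.07.2018 -> B=11, C=8; 21.07.2018 -> B=1, C=1
print(df.groupby('Date').sum())
```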
You can use DataFrameGroupBy.agg
: compare each value with 0
and sum
the resulting boolean mask:
a) To count no. of zero values:
df1 = df.groupby('Date').agg(lambda x: x.eq(0).sum())
print (df1)
B C
Date
20.07.2018 0 1
21.07.2018 1 1
b) To count no. of non-zero values:
df2 = df.groupby('Date').agg(lambda x: x.ne(0).sum())
print (df2)
B C
Date
20.07.2018 2 1
21.07.2018 1 1
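If both counts are wanted side by side, a single agg call with named aggregation (available in pandas >= 0.25) does it in one pass; the output column names here are my own choice, not from the answer above:

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['20.07.2018', '20.07.2018', '21.07.2018', '21.07.2018'],
    'B': [10, 1, 0, 1],
    'C': [8, 0, 1, 0],
})

# One groupby pass, zero and non-zero counts for each column
out = df.groupby('Date').agg(
    B_zero=('B', lambda s: s.eq(0).sum()),
    B_nonzero=('B', lambda s: s.ne(0).sum()),
    C_zero=('C', lambda s: s.eq(0).sum()),
    C_nonzero=('C', lambda s: s.ne(0).sum()),
)
print(out)
```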
Another idea, to improve performance: create a Date index with set_index
, compare the columns against 0, and then sum
per group of index values:
df1 = df.set_index('Date').eq(0).groupby(level=0).sum()
print (df1)
B C
Date
20.07.2018 0 1
21.07.2018 1 1
df2 = df.set_index('Date').ne(0).groupby(level=0).sum()
print (df2)
B C
Date
20.07.2018 2 1
21.07.2018 1 1
Try also:
df.groupby('Date').agg(lambda x: len(x) - x.astype(bool).sum(axis=0))
Output:
B C
Date
20.07.2018 0 1
21.07.2018 1 1
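The non-zero complement of the same trick needs no subtraction, since truthy entries sum directly (a small sketch using the same sample frame):

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['20.07.2018', '20.07.2018', '21.07.2018', '21.07.2018'],
    'B': [10, 1, 0, 1],
    'C': [8, 0, 1, 0],
})

# Non-zero values are truthy, so summing the boolean cast counts them
df3 = df.groupby('Date').agg(lambda x: x.astype(bool).sum())
print(df3)
```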
Using melt
, then groupby
:
Newdf = df.melt('Date')
Newdf['value'] = Newdf['value'].ne(0).astype(int)  # 1 = non-zero, 0 = zero
Newdf.groupby(['Date','variable']).value.value_counts().unstack([1,2]).sort_index(level=0, axis=1)
Out[69]:
variable B C
value 0 1 0 1
Date
20.07.2018 NaN 2.0 1.0 1.0
21.07.2018 1.0 1.0 1.0 1.0
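A related route (my own alternative sketch, not part of the answer above) is pd.crosstab on the melted frame, which tabulates zero and non-zero counts without the value_counts/unstack step:

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['20.07.2018', '20.07.2018', '21.07.2018', '21.07.2018'],
    'B': [10, 1, 0, 1],
    'C': [8, 0, 1, 0],
})

m = df.melt('Date')
m['is_zero'] = m['value'].eq(0)
# Rows: dates; columns: (original column, is_zero flag); cells: counts
out = pd.crosstab(m['Date'], [m['variable'], m['is_zero']])
print(out)
```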