
Counting the number of zeros per row in a Pandas DataFrame?

Tags:

python

pandas

Given a DataFrame, I would like to compute the number of zeros in each row. How can I do this with Pandas?

This is what I have done so far; it returns a boolean DataFrame marking which cells are zero:

    def is_blank(x):
        return x == 0

    indexer = train_df.applymap(is_blank)
erogol asked Mar 24 '15 10:03




1 Answer

Use a boolean comparison, which produces a boolean DataFrame. We can then cast this to int (True becomes 1, False becomes 0) and call sum with axis=1 to add up the values row-wise:

    In [56]:
    df = pd.DataFrame({'a':[1,0,0,1,3], 'b':[0,0,1,0,1], 'c':[0,0,0,0,0]})
    df

    Out[56]:
       a  b  c
    0  1  0  0
    1  0  0  0
    2  0  1  0
    3  1  0  0
    4  3  1  0

    In [64]:
    (df == 0).astype(int).sum(axis=1)

    Out[64]:
    0    2
    1    3
    2    2
    3    2
    4    1
    dtype: int64

Breaking the above down:

    In [65]:
    (df == 0)

    Out[65]:
           a      b     c
    0  False   True  True
    1   True   True  True
    2   True  False  True
    3  False   True  True
    4  False  False  True

    In [66]:
    (df == 0).astype(int)

    Out[66]:
       a  b  c
    0  0  1  1
    1  1  1  1
    2  1  0  1
    3  0  1  1
    4  0  0  1

EDIT

As david pointed out, the astype(int) cast is unnecessary because the Boolean values are upcast to int when sum is called, so this simplifies to:

(df == 0).sum(axis=1) 
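
For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch of the simplified version. The helper name zeros_per_row and the column name n_zeros are made up for illustration and are not part of pandas or of the original question.

```python
import pandas as pd

# Same sample frame as in the answer above.
df = pd.DataFrame({'a': [1, 0, 0, 1, 3],
                   'b': [0, 0, 1, 0, 1],
                   'c': [0, 0, 0, 0, 0]})


def zeros_per_row(frame):
    """Return a Series holding the number of zero-valued cells in each row.

    Hypothetical helper name, used only for this example.
    """
    # (frame == 0) is a boolean DataFrame; sum(axis=1) adds across the
    # columns of each row, counting True as 1 and False as 0.
    return (frame == 0).sum(axis=1)


zero_counts = zeros_per_row(df)
print(zero_counts)

# Optionally keep the counts next to the data (illustrative column name).
df_with_counts = df.assign(n_zeros=zero_counts)
print(df_with_counts)
```

Note that NaN == 0 evaluates to False, so missing values are not counted as zeros by this approach.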
EdChum answered Oct 11 '22 19:10
