
drop records that have 3 or more columns with 0

Tags:

python

pandas

I have a dataframe that contains a lot of 0s, like the example df below. I would like to drop any row that has 0 in three or more columns, as in the example Resultdf below.

The script below drops any records that are all 0:

df = df[(df.T != 0).any()]

Is there a way to modify it so it will drop records that are all 0, or that have three or more columns with 0? Or is there another way to do it?

print df:

ind_key prtCnt fldCnt TmCnt bmCnt
1       0      0      0     0
2       2      0      0     3
3       0      1      0     0
4       0      1      1     0

print Resultdf:

ind_key prtCnt fldCnt TmCnt bmCnt
2       2      0      0     3
4       0      1      1     0
user3476463 Avatar asked Dec 24 '22 08:12



2 Answers

You can use sum with axis=1 to count the zeros in each row, then keep the rows with fewer than three:

df[df.eq(0).sum(axis=1) < 3]  # df.eq(0) is the same as df == 0
Out[523]: 
   ind_key  prtCnt  fldCnt  TmCnt  bmCnt
1        2       2       0      0      3
3        4       0       1      1      0
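For anyone who wants to run this end to end, here is a minimal sketch. It assumes the frame from the question is rebuilt by hand with ind_key as an ordinary column (as the output above suggests); the names zero_counts and result are just illustrative.

```python
import pandas as pd

# Sample data copied from the question; ind_key is assumed to be a regular column.
df = pd.DataFrame({
    "ind_key": [1, 2, 3, 4],
    "prtCnt": [0, 2, 0, 0],
    "fldCnt": [0, 0, 1, 1],
    "TmCnt": [0, 0, 0, 1],
    "bmCnt": [0, 3, 0, 0],
})

# df.eq(0) gives a boolean frame; summing along axis=1 counts the zeros per row.
zero_counts = df.eq(0).sum(axis=1)

# Keep only the rows that have fewer than three zeros.
result = df[zero_counts < 3]
print(result)
```

Keeping the count in its own variable makes it easy to inspect, or to change the cutoff later.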
BENY Avatar answered Jan 06 '23 11:01



Mask the zeros as NaN, then use the idiomatic dropna with the thresh parameter set to the minimum number of non-zero values a row must keep:

df[df != 0].dropna(thresh=len(df.columns) - 2, axis=0)

   ind_key  prtCnt  fldCnt  TmCnt  bmCnt
1        2     2.0     NaN    NaN    3.0
3        4     NaN     1.0    1.0    NaN
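Note that the result above has NaN where the zeros were, and the affected columns become float. If the original values are wanted, one possible variation (a sketch, not part of the answer as posted) is to use the masked frame only to find the surviving row labels and then select those rows from the original df. The names min_nonzero, keep and result are illustrative.

```python
import pandas as pd

# Sample data copied from the question; ind_key is assumed to be a regular column.
df = pd.DataFrame({
    "ind_key": [1, 2, 3, 4],
    "prtCnt": [0, 2, 0, 0],
    "fldCnt": [0, 0, 1, 1],
    "TmCnt": [0, 0, 0, 1],
    "bmCnt": [0, 3, 0, 0],
})

# A row survives if it has at least (number of columns - 2) non-zero values,
# which is the same as having fewer than three zeros.
min_nonzero = len(df.columns) - 2

# Turn zeros into NaN only to decide which rows to keep.
keep = df[df != 0].dropna(thresh=min_nonzero).index

# Select those rows from the untouched original, so zeros and integer dtypes remain.
result = df.loc[keep]
print(result)
```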
cs95 Avatar answered Jan 06 '23 10:01
