How to remove rows that contain NaN in both columns b and d?

Given a DataFrame like this:

      a    b     c     d
0   1.0  NaN   3.0   NaN
1   NaN  6.0   NaN   8.0
2   9.0  NaN   NaN   NaN
3  13.0  NaN  15.0  16.0

I want to remove rows that contain NaN in both b and d columns. So I want the result to be like this.

      a    b     c     d
1   NaN  6.0   NaN   8.0
3  13.0  NaN  15.0  16.0

In this situation I can't use df.dropna(thresh=2), because thresh only counts the non-NaN values in each row and rows 0 and 1 both have two,
and if I use df.dropna(subset=['b', 'd']) then row 3 will be removed too.
What should I do now?

June Yoon asked Sep 15 '25 21:09



2 Answers

dropna has an additional parameter, how:

how : {‘any’, ‘all’}, default ‘any’
    Determine if row or column is removed from DataFrame, when we have at least one NA or all NA.
        ‘any’ : If any NA values are present, drop that row or column.
        ‘all’ : If all values are NA, drop that row or column.

If you set it to 'all', a row is dropped only when every column being checked is NaN. Combined with subset, the check is limited to the listed columns, so in your case df.dropna(subset=['b', 'd'], how="all") would work.
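As a quick check, here is a minimal sketch that rebuilds the frame from the question (the construction with np.nan is my own; the question only shows the printed frame) and applies the call above:

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame(
    {
        "a": [1.0, np.nan, 9.0, 13.0],
        "b": [np.nan, 6.0, np.nan, np.nan],
        "c": [3.0, np.nan, np.nan, 15.0],
        "d": [np.nan, 8.0, np.nan, 16.0],
    }
)

# Drop a row only when every column listed in `subset` is NaN.
result = df.dropna(subset=["b", "d"], how="all")
print(result)
```

Rows 0 and 2 have NaN in both b and d, so they are dropped; rows 1 and 3 are kept. Note that dropna returns a new frame, so assign the result back to df if you want to keep it.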

Battleman answered Sep 19 '25 05:09



You could keep only the rows where at least one of b and d is not NaN:

df = df[df[['b', 'd']].notna().any(axis=1)]
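To see what the mask looks like, here is a small sketch on the frame from the question (the frame construction and the names mask and filtered are only for illustration):

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame(
    {
        "a": [1.0, np.nan, 9.0, 13.0],
        "b": [np.nan, 6.0, np.nan, np.nan],
        "c": [3.0, np.nan, np.nan, 15.0],
        "d": [np.nan, 8.0, np.nan, 16.0],
    }
)

# notna() marks non-NaN cells; any(axis=1) is True for a row
# when at least one of b and d holds a value.
mask = df[["b", "d"]].notna().any(axis=1)
print(mask.tolist())

# Boolean indexing keeps only the rows where the mask is True.
filtered = df[mask]
print(filtered)
```

On this data the mask is False for rows 0 and 2 and True for rows 1 and 3, so the filtered frame matches what dropna(subset=['b', 'd'], how='all') returns.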
Ayoub ZAROU answered Sep 19 '25 05:09
