I have a DataFrame like this:
      a    b     c     d
0   1.0  NaN   3.0   NaN
1   NaN  6.0   NaN   8.0
2   9.0  NaN   NaN   NaN
3  13.0  NaN  15.0  16.0
I want to remove the rows where both column b and column d are NaN, so the result should look like this:
      a    b     c     d
1   NaN  6.0   NaN   8.0
3  13.0  NaN  15.0  16.0
In this situation I can't use df.dropna(thresh=2), because it counts non-NaN values across all columns, so row 0 (which has values in a and c) would be kept. And if I use df.dropna(subset=['b', 'd']), then row 3 will be removed too, since it has NaN in b.
What should I do?
dropna has an additional parameter, how:
how{‘any’, ‘all’}, default ‘any’
Determine if row or column is removed from DataFrame, when we have at least one NA or all NA.
‘any’ : If any NA values are present, drop that row or column.
‘all’ : If all values are NA, drop that row or column.
If you set it to 'all', dropna only drops the rows in which every considered value is NaN. Combined with subset, only the listed columns are considered, so in your case df.dropna(subset=['b', 'd'], how="all") would work.
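To check this against the data in the question, here is a minimal sketch. The frame is rebuilt by hand from the table above, and the variable names df and result are just for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame(
    {
        "a": [1.0, np.nan, 9.0, 13.0],
        "b": [np.nan, 6.0, np.nan, np.nan],
        "c": [3.0, np.nan, np.nan, 15.0],
        "d": [np.nan, 8.0, np.nan, 16.0],
    }
)

# Drop a row only when every column listed in subset is NaN.
# Rows 0 and 2 have NaN in both b and d, so they are removed.
result = df.dropna(subset=["b", "d"], how="all")
print(result)
```

Rows 1 and 3 remain, and the original index labels are preserved. dropna returns a new DataFrame, so assign the result back to df if you want to keep it.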
You could also build a boolean mask that keeps the rows where at least one of b and d is not NaN:
df = df[df[['b', 'd']].notna().any(axis=1)]
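As a rough sketch of how the mask works on the question's data (the frame is rebuilt by hand, and the names mask and filtered are only illustrative), this also checks that it gives the same rows as dropna with subset and how="all":

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame(
    {
        "a": [1.0, np.nan, 9.0, 13.0],
        "b": [np.nan, 6.0, np.nan, np.nan],
        "c": [3.0, np.nan, np.nan, 15.0],
        "d": [np.nan, 8.0, np.nan, 16.0],
    }
)

# True for rows where b or d (or both) holds a non-NaN value.
mask = df[["b", "d"]].notna().any(axis=1)

# Boolean indexing keeps only the rows where the mask is True.
filtered = df[mask]
print(filtered)

# Compare with the dropna approach on the same frame.
same_as_dropna = filtered.equals(df.dropna(subset=["b", "d"], how="all"))
print(same_as_dropna)
```

The mask version is a bit more verbose, but it is easy to adapt when the condition is not purely about NaN, for example by combining it with other boolean conditions using & or |.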