This is an extension to this question, where OP wanted to know how to drop rows where the values in a single column are NaN.
I'm wondering how I can drop rows where the values in 2 (or more) columns are both NaN. Using the second answer's created Data Frame:
    In [1]: df = pd.DataFrame(np.random.randn(10,3))

    In [2]: df.iloc[::2,0] = np.nan; df.iloc[::4,1] = np.nan; df.iloc[::3,2] = np.nan;

    In [3]: df
    Out[3]:
              0         1         2
    0       NaN       NaN       NaN
    1  2.677677 -1.466923 -0.750366
    2       NaN  0.798002 -0.906038
    3  0.672201  0.964789       NaN
    4       NaN       NaN  0.050742
    5 -1.250970  0.030561 -2.678622
    6       NaN  1.036043       NaN
    7  0.049896 -0.308003  0.823295
    8       NaN       NaN  0.637482
    9 -0.310130  0.078891       NaN

If I use dropna(), specifically dropna(subset=[1,2]), it performs an "or" type drop and leaves:
    In [4]: df.dropna(subset=[1,2])
    Out[4]:
              0         1         2
    1  2.677677 -1.466923 -0.750366
    2       NaN  0.798002 -0.906038
    5 -1.250970  0.030561 -2.678622
    7  0.049896 -0.308003  0.823295

What I want is an "and" type drop, which drops only the rows that have NaN in both column 1 and column 2. This would leave:
              0         1         2
    1  2.677677 -1.466923 -0.750366
    2       NaN  0.798002 -0.906038
    3  0.672201  0.964789       NaN
    4       NaN       NaN  0.050742
    5 -1.250970  0.030561 -2.678622
    6       NaN  1.036043       NaN
    7  0.049896 -0.308003  0.823295
    8       NaN       NaN  0.637482
    9 -0.310130  0.078891       NaN

where only the first row is dropped.
Any ideas?
EDIT: changed data frame values for consistency
The dropna() method drops rows that contain NaN (Not a Number) or None values from a pandas DataFrame. By default it returns a copy of the DataFrame with those rows removed. To modify the existing DataFrame instead, pass inplace=True.
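To make the copy versus in-place difference concrete, here is a minimal sketch. The three-row frame and the names `small` and `cleaned` are made up for illustration and are not part of the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame for illustration: only the first row is complete.
small = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, None]})

# Default call: returns a new DataFrame and leaves `small` untouched.
cleaned = small.dropna()
rows_before = len(small)

# inplace=True: modifies `small` itself and returns None.
result = small.dropna(inplace=True)
rows_after = len(small)

print(rows_before, len(cleaned), rows_after, result)
```

Reassigning (`small = small.dropna()`) has the same effect as `inplace=True` and is often easier to read in a chain of operations.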
When it comes to dropping null values in pandas DataFrames, the pandas.DataFrame.dropna() method is your friend. When you call dropna() on the whole DataFrame without any arguments (i.e. using the default behaviour), it drops every row that has at least one missing value.
To drop all rows that contain any NaN values, use df.dropna().
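As a rough illustration of that default, the sketch below rebuilds the NaN pattern from the question on a deterministic frame. The `np.arange` values are an assumption standing in for the random numbers, and `.iloc` is used for the positional assignment.

```python
import numpy as np
import pandas as pd

# Same NaN layout as the question, but with predictable values (assumption).
df = pd.DataFrame(np.arange(30, dtype=float).reshape(10, 3))
df.iloc[::2, 0] = np.nan
df.iloc[::4, 1] = np.nan
df.iloc[::3, 2] = np.nan

# Default how='any': a row is dropped if any column is NaN.
complete_rows = df.dropna()

# Restricting to columns 1 and 2 is still an "or" type drop.
subset_rows = df.dropna(subset=[1, 2])

print(complete_rows.index.tolist())
print(subset_rows.index.tolist())
```

The second result keeps the same row labels as Out[4] in the question, which is the "or" behaviour the asker wants to avoid.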
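Either of the following will do it:

    df.dropna(subset=[1, 2], how='all')

or

    df.dropna(subset=[1, 2], thresh=1)

With how='all', a row is dropped only when every column listed in subset is NaN. With thresh=1, a row is kept as long as it has at least one non-NaN value in those columns, which amounts to the same thing here.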
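Here is a small self-contained check of both calls, assuming a deterministic frame with the same NaN layout as the question (the `np.arange` values are a stand-in for the random data). The boolean-mask line is an extra equivalent shown for comparison, not part of the answer above.

```python
import numpy as np
import pandas as pd

# Rebuild the question's NaN pattern on predictable values (assumption).
df = pd.DataFrame(np.arange(30, dtype=float).reshape(10, 3))
df.iloc[::2, 0] = np.nan
df.iloc[::4, 1] = np.nan
df.iloc[::3, 2] = np.nan

# "and" type drop: remove a row only if columns 1 AND 2 are both NaN.
by_all = df.dropna(subset=[1, 2], how="all")

# Same result: keep rows with at least one non-NaN value in columns 1 and 2.
by_thresh = df.dropna(subset=[1, 2], thresh=1)

# Equivalent explicit mask, for comparison.
by_mask = df[df[[1, 2]].notna().any(axis=1)]

print(by_all.index.tolist())
```

Only row 0 has NaN in both columns, so all three results keep rows 1 through 9.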