This is an extension to this question, where OP wanted to know how to drop rows where the values in a single column are NaN.
I'm wondering how I can drop rows where the values in 2 (or more) columns are both NaN. Using the second answer's created Data Frame:
In [1]: df = pd.DataFrame(np.random.randn(10,3)) In [2]: df.ix[::2,0] = np.nan; df.ix[::4,1] = np.nan; df.ix[::3,2] = np.nan; In [3]: df Out[3]: 0 1 2 0 NaN NaN NaN 1 2.677677 -1.466923 -0.750366 2 NaN 0.798002 -0.906038 3 0.672201 0.964789 NaN 4 NaN NaN 0.050742 5 -1.250970 0.030561 -2.678622 6 NaN 1.036043 NaN 7 0.049896 -0.308003 0.823295 8 NaN NaN 0.637482 9 -0.310130 0.078891 NaN
If I use the drop.na()
command, specifically the drop.na(subset=[1,2])
, then it completes an "or" type drop and leaves:
In[4]: df.dropna(subset=[1,2]) Out[4]: 0 1 2 1 2.677677 -1.466923 -0.750366 2 NaN 0.798002 -0.906038 5 -1.250970 0.030561 -2.678622 7 0.049896 -0.308003 0.823295
What I want is an "and" type drop, where it drops rows where there is an NaN
in column index 1 and 2. This would leave:
0 1 2 1 2.677677 -1.466923 -0.750366 2 NaN 0.798002 -0.906038 3 0.672201 0.964789 NaN 4 NaN NaN 0.050742 5 -1.250970 0.030561 -2.678622 6 NaN 1.036043 NaN 7 0.049896 -0.308003 0.823295 8 NaN NaN 0.637482 9 -0.310130 0.078891 NaN
where only the first row is dropped.
Any ideas?
EDIT: changed data frame values for consistency
By using dropna() method you can drop rows with NaN (Not a Number) and None values from pandas DataFrame. Note that by default it returns the copy of the DataFrame after removing rows. If you wanted to remove from the existing DataFrame, you should use inplace=True .
When it comes to dropping null values in pandas DataFrames, pandas. DataFrame. dropna() method is your friend. When you call dropna() over the whole DataFrame without specifying any arguments (i.e. using the default behaviour) then the method will drop all rows with at least one missing value.
To drop all the rows with the NaN values, you may use df. dropna().
Any one of the following two:
df.dropna(subset=[1, 2], how='all')
or
df.dropna(subset=[1, 2], thresh=1)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With