I have a table with a column that has some NaN values in it:
A  B  C  D
2  3  2  NaN
3  4  5  5
2  3  1  NaN
I'd like to get all rows where D = NaN. How can I do this?
Creating a DataFrame for illustration (containing a NaN), assuming pandas is imported as pd and NumPy as np:
In [86]: df = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 4, 5], 'c': [np.nan, 4, 5]})
In [87]: df
Out[87]:
a b c
0 1 3 NaN
1 2 4 4
2 3 5 5
Checking which rows have a null in column c
In [88]: pd.isnull(df['c'])
Out[88]:
0 True
1 False
2 False
Name: c, dtype: bool
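The reason for using pd.isnull rather than an equality test is that NaN never compares equal to anything, including itself. A minimal sketch, rebuilding the same illustrative df (the names eq_mask and null_mask are only for this example):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 4, 5], 'c': [np.nan, 4, 5]})

# NaN != NaN, so an equality comparison marks no rows at all.
eq_mask = df['c'] == np.nan

# pd.isnull tests for missing values directly.
null_mask = pd.isnull(df['c'])

print(eq_mask.tolist())    # [False, False, False]
print(null_mask.tolist())  # [True, False, False]
```

So df[df['c'] == np.nan] returns an empty frame, while the pd.isnull mask picks out row 0.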
Checking which rows don't have a null in column c
In [90]: pd.notnull(df['c'])
Out[90]:
0 False
1 True
2 True
Name: c, dtype: bool
Selecting rows of df where c is not null
In [91]: df[pd.notnull(df['c'])]
Out[91]:
a b c
1 2 4 4
2 3 5 5
Selecting rows of df where c is null
In [93]: df[pd.isnull(df['c'])]
Out[93]:
a b c
0 1 3 NaN
Selecting the values of column c where c is not null
In [94]: df['c'][pd.notnull(df['c'])]
Out[94]:
1 4
2 5
Name: c, dtype: float64
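Applied to the table from the question, the same idea could look like the sketch below. It uses the Series methods isna() and notna(), which should be equivalent to pd.isnull and pd.notnull in reasonably recent pandas versions; the variable names rows_with_nan and rows_without_nan are only for illustration.

```python
import numpy as np
import pandas as pd

# The table from the question, with NaN in column D for rows 0 and 2.
df = pd.DataFrame({
    'A': [2, 3, 2],
    'B': [3, 4, 3],
    'C': [2, 5, 1],
    'D': [np.nan, 5, np.nan],
})

# Boolean mask: True where D is missing.
rows_with_nan = df[df['D'].isna()]

# The complementary selection: rows where D has a value.
rows_without_nan = df[df['D'].notna()]

print(rows_with_nan)
print(rows_without_nan)
```

Here rows_with_nan holds the rows with index 0 and 2, which is what the question asks for.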
For a solution that doesn't involve pandas, you can do something like this, where y is a 2-D NumPy float array:
goodind = np.where(np.sum(np.isnan(y), axis=1) == 0)[0]  # indices of rows not containing NaNs
(or the negation if you want the rows with NaN) and use the indices to slice the data.
I am not sure sum is the best way to combine the booleans. np.any and np.all also accept an axis parameter, so np.isnan(y).any(axis=1) expresses the same per-row test more directly.
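A minimal sketch of the NumPy approach, using any(axis=1) to reduce the NaN mask along each row. The array y below is hypothetical sample data mirroring the table in the question, and badind, clean_rows and nan_rows are names made up for this example.

```python
import numpy as np

# Hypothetical sample data standing in for y (rows 0 and 2 contain a NaN).
y = np.array([[2.0, 3.0, 2.0, np.nan],
              [3.0, 4.0, 5.0, 5.0],
              [2.0, 3.0, 1.0, np.nan]])

# True for every row that contains at least one NaN.
has_nan = np.isnan(y).any(axis=1)

goodind = np.where(~has_nan)[0]  # indices of rows without NaNs
badind = np.where(has_nan)[0]    # indices of rows with at least one NaN

clean_rows = y[goodind]
nan_rows = y[badind]

print(goodind)  # [1]
print(badind)   # [0 2]
```

The boolean mask can also index the array directly (y[has_nan] or y[~has_nan]) if the integer indices themselves are not needed.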