I have a dataframe with two date fields, and I want to filter it to see the rows where either of the date fields is null.
ID Date1 Date2
58844880 04/11/16 NaN
59745846 04/12/16 04/14/16
59743311 04/13/16 NaN
59745848 04/14/16 04/11/16
59598413 NaN NaN
59745921 04/14/16 04/14/16
59561199 04/15/16 04/15/16
NaN 04/16/16 04/16/16
59561198 NaN 04/17/16
The result should look like this:
ID Date1 Date2
58844880 04/11/16 NaN
59743311 04/13/16 NaN
59598413 NaN NaN
59561198 NaN 04/17/16
I tried the following code:
df = (df['Date1'].isnull() | df['Date1'].isnull())
Use boolean indexing:
mask = df['Date1'].isnull() | df['Date2'].isnull()
print (df[mask])
ID Date1 Date2
0 58844880.0 04/11/16 NaN
2 59743311.0 04/13/16 NaN
4 59598413.0 NaN NaN
8 59561198.0 NaN 04/17/16
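If you want to verify this end to end, here is a minimal, self-contained sketch that rebuilds the sample data from the question (dates kept as plain strings, NaN for the missing values) and applies the same mask:

import pandas as pd
import numpy as np

# Rebuild the sample data from the question
df = pd.DataFrame({
    'ID':    [58844880, 59745846, 59743311, 59745848, 59598413,
              59745921, 59561199, np.nan, 59561198],
    'Date1': ['04/11/16', '04/12/16', '04/13/16', '04/14/16', np.nan,
              '04/14/16', '04/15/16', '04/16/16', np.nan],
    'Date2': [np.nan, '04/14/16', np.nan, '04/11/16', np.nan,
              '04/14/16', '04/15/16', '04/16/16', '04/17/16'],
})

# Keep rows where Date1 or Date2 is missing
mask = df['Date1'].isnull() | df['Date2'].isnull()
print(df[mask])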
Timings:
#[900000 rows x 3 columns]
df = pd.concat([df]*100000).reset_index(drop=True)
In [12]: %timeit (df[df['Date1'].isnull() | df['Date2'].isnull()])
10 loops, best of 3: 89.3 ms per loop
In [13]: %timeit (df[df.filter(like='Date').isnull().any(1)])
10 loops, best of 3: 146 ms per loop
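The slower variant works by selecting every column whose name contains 'Date' with DataFrame.filter and then reducing across the row. A sketch of the same idea, written with the axis passed by keyword so it works across pandas versions:

# Select every column whose name contains 'Date', then flag rows with any null value
date_cols = df.filter(like='Date')
print(df[date_cols.isnull().any(axis=1)])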
Quickly see which columns have any null values
df.isnull().any()
Count the null values in each column
df.isnull().sum()
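On the sample frame above these report (exact Series formatting may vary) that every column contains nulls, plus the per-column counts; to count the rows with at least one null value, reduce across the row axis first:

print(df.isnull().any())
# ID       True
# Date1    True
# Date2    True

print(df.isnull().sum())
# ID       1
# Date1    2
# Date2    3

# Count the rows that contain at least one null value (in any column)
print(df.isnull().any(axis=1).sum())   # 5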
(1) Create a truth table of null values (i.e. a DataFrame with True/False in each cell, according to whether that value is null)
truth_table = df.isnull()
(2) Reduce the truth table to one True/False per row, showing which rows have any null values
conclusive_truth_table = truth_table.any(axis='columns')
(3) Isolate/show the rows that have any null values
df[conclusive_truth_table]
(1)-(3) Put it all together
df[df.isnull().any(axis='columns')]
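One caveat: df.isnull() looks at every column, so on the sample data this also picks up the row whose ID is missing even though both of its dates are present; restrict the check to the date columns (next snippet) to get exactly the output asked for in the question.

# Any null in ANY column - also returns the row whose ID is NaN
print(df[df.isnull().any(axis='columns')])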
Isolate rows that have null values in any of the specified columns
df[df.loc[:, ['Date1', 'Date2']].isnull().any(axis='columns')]
Isolate rows that have null values in BOTH specified columns
df[df.loc[:, ['Date1', 'Date2']].isnull().sum(axis=1) == 2]
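On the sample data only one row has both dates missing; an equivalent spelling with .all() is shown as well:

print(df[df.loc[:, ['Date1', 'Date2']].isnull().sum(axis=1) == 2])
#            ID Date1 Date2
# 4  59598413.0   NaN   NaN

# Equivalent, arguably clearer, with all()
print(df[df[['Date1', 'Date2']].isnull().all(axis=1)])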