I have got a dataframe with a column of datetime64 type. In this column there are several rows with dates as 1999-09-09 23:59:59 where as they should have actually been represented as missing dates NaT. Somebody just decided to use this particular date to represent the missing data. Now I want these dates to be replaced as NaT (the missing date type for Pandas).
Also if I perform operation on this column with NaTs, like
df['date'] - df['column with missing date']
Does Pandas ignore the missing dates and maintain NaT for those rows or will it throw an error some thing like Null pointer exception in Java.
In [6]:
import pandas as pd
df = pd.DataFrame({'date':[pd.datetime(1999,9,9,23,59,59), pd.datetime(2014,1,1)]* 10})
df
Out[6]:
date
0 1999-09-09 23:59:59
1 2014-01-01 00:00:00
2 1999-09-09 23:59:59
3 2014-01-01 00:00:00
4 1999-09-09 23:59:59
5 2014-01-01 00:00:00
6 1999-09-09 23:59:59
7 2014-01-01 00:00:00
8 1999-09-09 23:59:59
9 2014-01-01 00:00:00
10 1999-09-09 23:59:59
11 2014-01-01 00:00:00
12 1999-09-09 23:59:59
13 2014-01-01 00:00:00
14 1999-09-09 23:59:59
15 2014-01-01 00:00:00
16 1999-09-09 23:59:59
17 2014-01-01 00:00:00
18 1999-09-09 23:59:59
19 2014-01-01 00:00:00
In [9]:
import numpy as np
df.loc[df['date'] == '1999-09-09 23:59:59 ', 'date'] = pd.NaT
df
Out[9]:
date
0 NaT
1 2014-01-01
2 NaT
3 2014-01-01
4 NaT
5 2014-01-01
6 NaT
7 2014-01-01
8 NaT
9 2014-01-01
10 NaT
11 2014-01-01
12 NaT
13 2014-01-01
14 NaT
15 2014-01-01
16 NaT
17 2014-01-01
18 NaT
19 2014-01-01
To answer your second question most pandas functions handle NaN's appropriately, you can always just drop them:
In [10]:
df.dropna()
Out[10]:
date
1 2014-01-01
3 2014-01-01
5 2014-01-01
7 2014-01-01
9 2014-01-01
11 2014-01-01
13 2014-01-01
15 2014-01-01
17 2014-01-01
19 2014-01-01
and perform the operation just on these rows
There are some operations, especially between columns, that do not disconsider NaNs or NaTs. That is why you are getting NaTs as a result.
If you want to disconsider the 1999-09-09 23:59:59 and also have a subtractable column, try to convert to NaTs and then swap the NaTs with zeros (.fillna(0)
), so that, when subtracted, it will keep the value from the other column.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With