I'm having some difficulty with the pandas to_datetime function, and with datetimes in pandas in general. Specifically, to_datetime fails silently when applied to a pandas Series: it returns the data unchanged, and I have to explicitly iterate over each value to get the conversion to work, even though (at least according to this SO question) both approaches should work the same.
In [81]: np.__version__
Out[81]: '1.6.1'
In [82]: pd.__version__
Out[82]: '0.12.0'
In [83]: a[0:10]
Out[83]:
0 8/31/2013 14:57:00
1 8/31/2013 13:55:00
2 8/31/2013 15:45:00
3 9/1/2013 13:26:00
4 9/1/2013 13:56:00
5 9/2/2013 13:55:00
6 9/3/2013 13:33:00
7 9/3/2013 14:11:00
8 9/3/2013 14:35:00
9 9/4/2013 14:28:00
Name: date_time, dtype: object
In [84]: a[0]
Out[84]: '8/31/2013 14:57:00'
In [85]: a=pd.to_datetime(a)
In [86]: a[0]
Out[86]: '8/31/2013 14:57:00'
In [87]: a=[pd.to_datetime(date) for date in a]
In [88]: a[0]
Out[88]: Timestamp('2013-08-31 14:57:00', tz=None)
Any thoughts about why this is? I seem to be having trouble in general with this data and the date_time column not being parsed correctly, and I suspect it may have something to do with this failure.
Thanks,
Dave
This has been fixed in newer versions of pandas: the default for the errors kwarg is now 'raise' rather than 'ignore'.
The new behavior is:
In [21]: pd.to_datetime(dates) # same as errors='raise'
...
ValueError: Given date string not likely a datetime.
In [22]: pd.to_datetime(dates, errors="ignore") # the original problem
Out[22]:
0 1/1/2014
1 A
dtype: object
That is, to_datetime no longer fails silently!
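With the default now raising, a caller who wants to handle bad input has to catch the exception explicitly. The following is only a minimal sketch, assuming a recent pandas where a parse failure surfaces as a ValueError (or a subclass of it); the variable names are illustrative and not from the original answer.

```python
import pandas as pd

# Same sample data as in the answer: one valid date, one junk value
dates = pd.Series(["1/1/2014", "A"])

try:
    # errors='raise' is the default in newer pandas
    parsed = pd.to_datetime(dates)
    error_message = None
except ValueError as exc:
    # Parsing failed loudly instead of returning the input unchanged
    parsed = None
    error_message = str(exc)

print(error_message)
```

The exact wording of the error message differs between pandas versions, so it is safer to rely on the exception type than on the text.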
The old answer is kept below...
As DaveA points out (after checking my comment), by default to_datetime fails silently if there is an issue and returns what was originally passed:
In [1]: dates = pd.Series(['1/1/2014', 'A'])
In [2]: pd.to_datetime(dates) # doesn't even convert first date
Out[2]:
0 1/1/2014
1 A
dtype: object
In [3]: pd.to_datetime(dates, errors='raise')
...
ValueError: Given date string not likely a datetime.
Note: errors='coerce' used to be spelled coerce=True in older pandas versions.
In [4]: pd.to_datetime(dates, errors='coerce')
Out[4]:
0 2014-01-01
1 NaT
dtype: datetime64[ns]
This behavior of to_datetime is discussed in the timeseries section of the docs.
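For data shaped like the date_time column in the question, one option is to combine errors='coerce' with an explicit format string, so that anything not matching the expected layout becomes NaT. This is a sketch under assumptions: the format "%m/%d/%Y %H:%M:%S" is inferred from the sample values shown in the question, and the malformed third entry is made up for illustration.

```python
import pandas as pd

# Strings in the same layout as the question's column, plus one
# deliberately malformed entry (hypothetical, for illustration only)
raw = pd.Series(
    ["8/31/2013 14:57:00", "9/1/2013 13:26:00", "not a date"],
    name="date_time",
)

# Values that do not match the format become NaT instead of raising
parsed = pd.to_datetime(raw, format="%m/%d/%Y %H:%M:%S", errors="coerce")

print(parsed)
```

Passing format also avoids ambiguity between month-first and day-first interpretations, which may matter for dates such as 9/1/2013.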
You can see which dates failed to parse by checking isnull:
In [5]: dates[pd.isnull(pd.to_datetime(dates, errors='coerce'))]
Out[5]:
1 A
dtype: object
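The same check can be wrapped in a small helper that also leaves out values that were already missing before parsing. This is just a sketch; split_unparseable is a hypothetical name, not a pandas function, and it assumes the input is a Series.

```python
import pandas as pd

def split_unparseable(values):
    """Return (parsed, bad): coerced datetimes and the originals that failed."""
    parsed = pd.to_datetime(values, errors="coerce")
    # Flag only values that were present in the input but became NaT
    failed = parsed.isna() & values.notna()
    return parsed, values[failed]

dates = pd.Series(["1/1/2014", "A", None])
parsed, bad = split_unparseable(dates)

print(bad)
```

Here only the entry 'A' is reported as bad; the None entry is NaT in the result but is not counted as a parse failure.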
In summary: in older pandas versions, pd.to_datetime failed silently and simply returned the original values when parsing failed, so a single malformed input could leave the entire Series unconverted.