
Pandas.to_datetime function fails silently

I'm having some difficulty with the pandas to_datetime function, and with datetimes in pandas in general. Specifically, to_datetime fails silently when applied to a pandas Series: it returns the input unchanged, and I have to iterate over each value individually to get the conversion to work, even though (at least according to this SO question) both approaches should behave the same.

In [81]: np.__version__
Out[81]: '1.6.1'

In [82]: pd.__version__
Out[82]: '0.12.0'

In [83]: a[0:10]
Out[83]: 
0    8/31/2013 14:57:00
1    8/31/2013 13:55:00
2    8/31/2013 15:45:00
3     9/1/2013 13:26:00
4     9/1/2013 13:56:00
5     9/2/2013 13:55:00
6     9/3/2013 13:33:00
7     9/3/2013 14:11:00
8     9/3/2013 14:35:00
9     9/4/2013 14:28:00
Name: date_time, dtype: object

In [84]: a[0]
Out[84]: '8/31/2013 14:57:00'

In [85]: a=pd.to_datetime(a)

In [86]: a[0]
Out[86]: '8/31/2013 14:57:00'

In [87]: a=[pd.to_datetime(date) for date in a]

In [88]: a[0]
Out[88]: Timestamp('2013-08-31 14:57:00', tz=None)

Any thoughts about why this is? I seem to be having trouble in general with this data and the date_time column not being parsed correctly, and I suspect it may have something to do with this failure.

Thanks,

Dave

DaveA, asked Jan 14 '14 05:01


2 Answers

This has been fixed in newer pandas: the default value of the errors kwarg is now 'raise' rather than 'ignore'.
The new behavior is:

In [21]: pd.to_datetime(dates)  # same as errors='raise'
...
ValueError: Given date string not likely a datetime.

In [22]: pd.to_datetime(dates, errors="ignore")  # the original problem
Out[22]:
0    1/1/2014
1           A
dtype: object

That is, to_datetime no longer fails silently!
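
If you still want the old "give me back the input when parsing fails" behavior, one option is to catch the exception yourself so the failure is at least visible. This is only a minimal sketch: the helper name to_datetime_or_original is made up for illustration and is not part of pandas. As far as I know, recent pandas releases (2.2 onwards) also deprecate errors="ignore", which is another reason to handle the error explicitly.

```python
import pandas as pd


def to_datetime_or_original(values):
    """Hypothetical helper (not a pandas function).

    Returns the parsed datetimes, or the input unchanged if parsing
    raises, and reports the failure instead of hiding it.
    """
    try:
        return pd.to_datetime(values)  # default is errors='raise'
    except (ValueError, TypeError) as exc:
        print(f"to_datetime failed, returning input unchanged: {exc}")
        return values


dates = pd.Series(["1/1/2014", "A"])  # same data as in the answer
result = to_datetime_or_original(dates)

clean = pd.Series(["1/1/2014"])
parsed_clean = to_datetime_or_original(clean)
```

Here result is the original Series, because 'A' cannot be parsed, while parsed_clean comes back with a datetime dtype.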

The old answer is kept below...


As DaveA points out (after checking my comment), by default to_datetime fails silently if there is an issue and returns what was originally passed:

In [1]: dates = pd.Series(['1/1/2014', 'A'])

In [2]: pd.to_datetime(dates)  # doesn't even convert first date
Out[2]: 
0    1/1/2014
1           A
dtype: object

In [3]: pd.to_datetime(dates, errors='raise')
...
ValueError: Given date string not likely a datetime.

Note: errors='coerce' used to be spelled coerce=True in older pandas versions.

In [4]: pd.to_datetime(dates, errors='coerce')
Out[4]: 
0   2014-01-01
1          NaT
dtype: datetime64[ns]

This behavior of to_datetime is discussed in the timeseries section of the docs.

You can see which dates failed to parse by checking isnull:

In [5]: dates[pd.isnull(pd.to_datetime(dates, errors='coerce'))]
Out[5]: 
1    A
dtype: object
Andy Hayden, answered Oct 23 '22 20:10


pandas.to_datetime fails silently: if parsing fails, it simply returns the original input. A single malformed value can cause the conversion of the entire Series to fail silently.
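
To track down the malformed value, something along these lines may help. It is a rough sketch, not a definitive recipe: the sample Series is made up (the first two strings follow the format shown in the question, and "not a date" is a hypothetical bad entry, since the actual offending value isn't shown above), and the format string is my assumption about how the date_time column is laid out.

```python
import pandas as pd

# Hypothetical sample: two values in the question's format plus one
# made-up malformed entry standing in for whatever broke the real data.
a = pd.Series(["8/31/2013 14:57:00", "9/1/2013 13:26:00", "not a date"])

# An explicit format avoids guessing; errors='coerce' turns anything
# that does not match into NaT instead of raising or returning the input.
parsed = pd.to_datetime(a, format="%m/%d/%Y %H:%M:%S", errors="coerce")

# The rows that became NaT are the ones that failed to parse.
bad = a[parsed.isna()]
print(bad)
```

Once the bad rows are identified, you can fix or drop them and then convert the column without errors='coerce', so that any new problem raises instead of being hidden.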

DaveA, answered Oct 23 '22 18:10