
Why does pandas return timestamps instead of datetime objects when calling pd.to_datetime()?

According to the manual, pd.to_datetime() should create a datetime object.

Instead, when I call pd.to_datetime("2012-05-14"), I get a Timestamp object! Only calling to_datetime() on that object finally gives me a datetime object.

In [1]: pd.to_datetime("2012-05-14")
Out[1]: Timestamp('2012-05-14 00:00:00', tz=None)

In [2]: t = pd.to_datetime("2012-05-14")
In [3]: t.to_datetime()
Out[3]: datetime.datetime(2012, 5, 14, 0, 0)

Is there an explanation for this unexpected behaviour?

Xiphias asked May 20 '14 08:05


1 Answer

A Timestamp object is how pandas represents datetimes, so within pandas it is a datetime object. You, however, expected a datetime.datetime object.
Normally you should not need to care about this (it is mostly a matter of a different repr). As long as you are working with pandas, the Timestamp is fine. Even where you really want a datetime.datetime, most things will work, because Timestamp supports the datetime.datetime methods; otherwise you can use to_pydatetime() to retrieve the datetime.datetime object.
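To make this concrete, here is a minimal sketch, assuming a reasonably recent pandas. The variable names are only for illustration. It shows a datetime.datetime method working directly on the Timestamp, and to_pydatetime() used for the explicit conversion:

```python
import datetime as dt

import pandas as pd

# pd.to_datetime on a single string returns a pandas Timestamp.
ts = pd.to_datetime("2012-05-14")

# Methods known from datetime.datetime can be called on the Timestamp directly.
iso_text = ts.isoformat()

# When a plain datetime.datetime is really required, convert explicitly.
py_dt = ts.to_pydatetime()

print(type(ts).__name__)     # class name of the pandas object
print(type(py_dt).__name__)  # class name of the converted object
```

Since the conversion is explicit and cheap for a single value, it is usually enough to convert only at the point where another library insists on a plain datetime.datetime.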

The longer story:

  • pandas stores datetimes in an index or in columns as data of type datetime64 (these are not datetime.datetime objects). This is the standard numpy type for datetimes and performs better than using datetime.datetime objects:

     In [14]: import datetime as dt

     In [15]: df = pd.DataFrame({'A':[dt.datetime(2012,1,1), dt.datetime(2012,1,2)]})
    
     In [16]: df.dtypes
     Out[16]:
     A    datetime64[ns]
     dtype: object
    
     In [17]: df.loc[0,'A']
     Out[17]: Timestamp('2012-01-01 00:00:00', tz=None)
    
  • when you retrieve a single value from such a datetime column or index, you get a Timestamp object. It is more convenient to work with than a raw datetime64 value (more methods, a better representation, etc.), and it is a subclass of datetime.datetime, so it has all of its methods.
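The two points above can be checked with a short sketch. The column name and values follow the example above; everything else is an assumption for illustration. The exact datetime64 resolution shown in the dtype may differ between pandas versions, so the sketch only checks that the column is some datetime64 dtype:

```python
import datetime as dt

import pandas as pd

df = pd.DataFrame({"A": [dt.datetime(2012, 1, 1), dt.datetime(2012, 1, 2)]})

# The column is stored as a numpy-style datetime64 dtype, not as Python objects.
column_is_datetime64 = pd.api.types.is_datetime64_any_dtype(df["A"])

# Pulling out a single value gives a pandas Timestamp ...
value = df.loc[0, "A"]

# ... which is a subclass of datetime.datetime and inherits its methods.
timestamp_is_subclass = issubclass(pd.Timestamp, dt.datetime)
weekday = value.weekday()              # Monday is 0, Sunday is 6
formatted = value.strftime("%Y-%m-%d")

# If a list of plain datetime.datetime objects is needed, convert per element.
plain_values = [v.to_pydatetime() for v in df["A"]]
```

Converting a whole column element by element gives up the performance benefit of datetime64 storage, so it is probably best kept for the boundary where non-pandas code requires plain datetime.datetime objects.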
joris answered Sep 22 '22 14:09