Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Get MM-DD-YYYY from pandas Timestamp

dates seem to be a tricky thing in python, and I am having a lot of trouble simply stripping the date out of the pandas TimeStamp. I would like to get from 2013-09-29 02:34:44 to simply 09-29-2013

I have a dataframe with a column Created_date:

Name: Created_Date, Length: 1162549, dtype: datetime64[ns]`

I have tried applying the .date() method on this Series, eg: df.Created_Date.date(), but I get the error AttributeError: 'Series' object has no attribute 'date'

Can someone help me out?

like image 780
blaklaybul Avatar asked Oct 01 '13 00:10

blaklaybul


People also ask

How do I change date format to MM DD YYYY in pandas?

Please notice that you can also specify the output date format other than the default one, by using the dt. strftime() method. For example, you can choose to display the output date as MM/DD/YYYY by specifying dt. strftime('%m/%d/%Y') .

What is PD timestamp?

Timestamp is the pandas equivalent of python's Datetime and is interchangeable with it in most cases. It's the type used for the entries that make up a DatetimeIndex, and other timeseries oriented data structures in pandas. Parameters ts_inputdatetime-like, str, int, float.

How do I change the date format in pandas?

Function usedstrftime() can change the date format in python.


1 Answers

map over the elements:

In [239]: from operator import methodcaller

In [240]: s = Series(date_range(Timestamp('now'), periods=2))

In [241]: s
Out[241]:
0   2013-10-01 00:24:16
1   2013-10-02 00:24:16
dtype: datetime64[ns]

In [238]: s.map(lambda x: x.strftime('%d-%m-%Y'))
Out[238]:
0    01-10-2013
1    02-10-2013
dtype: object

In [242]: s.map(methodcaller('strftime', '%d-%m-%Y'))
Out[242]:
0    01-10-2013
1    02-10-2013
dtype: object

You can get the raw datetime.date objects by calling the date() method of the Timestamp elements that make up the Series:

In [249]: s.map(methodcaller('date'))

Out[249]:
0    2013-10-01
1    2013-10-02
dtype: object

In [250]: s.map(methodcaller('date')).values

Out[250]:
array([datetime.date(2013, 10, 1), datetime.date(2013, 10, 2)], dtype=object)

Yet another way you can do this is by calling the unbound Timestamp.date method:

In [273]: s.map(Timestamp.date)
Out[273]:
0    2013-10-01
1    2013-10-02
dtype: object

This method is the fastest, and IMHO the most readable. Timestamp is accessible in the top-level pandas module, like so: pandas.Timestamp. I've imported it directly for expository purposes.

The date attribute of DatetimeIndex objects does something similar, but returns a numpy object array instead:

In [243]: index = DatetimeIndex(s)

In [244]: index
Out[244]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-10-01 00:24:16, 2013-10-02 00:24:16]
Length: 2, Freq: None, Timezone: None

In [246]: index.date
Out[246]:
array([datetime.date(2013, 10, 1), datetime.date(2013, 10, 2)], dtype=object)

For larger datetime64[ns] Series objects, calling Timestamp.date is faster than operator.methodcaller which is slightly faster than a lambda:

In [263]: f = methodcaller('date')

In [264]: flam = lambda x: x.date()

In [265]: fmeth = Timestamp.date

In [266]: s2 = Series(date_range('20010101', periods=1000000, freq='T'))

In [267]: s2
Out[267]:
0    2001-01-01 00:00:00
1    2001-01-01 00:01:00
2    2001-01-01 00:02:00
3    2001-01-01 00:03:00
4    2001-01-01 00:04:00
5    2001-01-01 00:05:00
6    2001-01-01 00:06:00
7    2001-01-01 00:07:00
8    2001-01-01 00:08:00
9    2001-01-01 00:09:00
10   2001-01-01 00:10:00
11   2001-01-01 00:11:00
12   2001-01-01 00:12:00
13   2001-01-01 00:13:00
14   2001-01-01 00:14:00
...
999985   2002-11-26 10:25:00
999986   2002-11-26 10:26:00
999987   2002-11-26 10:27:00
999988   2002-11-26 10:28:00
999989   2002-11-26 10:29:00
999990   2002-11-26 10:30:00
999991   2002-11-26 10:31:00
999992   2002-11-26 10:32:00
999993   2002-11-26 10:33:00
999994   2002-11-26 10:34:00
999995   2002-11-26 10:35:00
999996   2002-11-26 10:36:00
999997   2002-11-26 10:37:00
999998   2002-11-26 10:38:00
999999   2002-11-26 10:39:00
Length: 1000000, dtype: datetime64[ns]

In [269]: timeit s2.map(f)
1 loops, best of 3: 1.04 s per loop

In [270]: timeit s2.map(flam)
1 loops, best of 3: 1.1 s per loop

In [271]: timeit s2.map(fmeth)
1 loops, best of 3: 968 ms per loop

Keep in mind that one of the goals of pandas is to provide a layer on top of numpy so that (most of the time) you don't have to deal with the low level details of the ndarray. So getting the raw datetime.date objects in an array is of limited use since they don't correspond to any numpy.dtype that is supported by pandas (pandas only supports datetime64[ns] [that's nanoseconds] dtypes). That said, sometimes you need to do this.

like image 50
Phillip Cloud Avatar answered Oct 15 '22 10:10

Phillip Cloud