
Python datetime formatted string to datetime object

I have a series of CSVs with a column containing a Python datetime-formatted string. Whilst parsing the CSV files (which could be tens of thousands of rows long), I want the date column to be converted from a string to an actual datetime object.

An example CSV row:

['0', '(2011, 12, 11, 15, 45, 20)', 'Arduino/libraries/dallas-temperature-control/'],

As you can see, the date is stored in the CSV as a string that looks exactly like the argument tuple of a datetime constructor.

I am looking for a fast way to build the datetime object without running it through datetime.strptime(row[1], "(%Y, %m, %d, %H, %M, %S)"). It seems counter-intuitive to have to interpret the date with strptime when it's ready to drop in as-is.

Karl M.W. asked Feb 10 '23 02:02


2 Answers

You can use ast.literal_eval to convert the string to a tuple of integers:

>>> import ast
>>> ast.literal_eval('(2011, 12, 11, 15, 45, 20)')
(2011, 12, 11, 15, 45, 20)

You can then unpack this tuple with the * operator (see e.g. "What does ** (double star) and * (star) do for parameters?") straight into the datetime constructor:

>>> import datetime
>>> datetime.datetime(*ast.literal_eval('(2011, 12, 11, 15, 45, 20)'))
datetime.datetime(2011, 12, 11, 15, 45, 20)
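Applied to a whole file, the same idea might look like the sketch below. It is only an illustration: the sample CSV text, the assumption that the date column is quoted in the file, and the helper name parse_rows are all hypothetical and not taken from the question.

```python
import ast
import csv
import datetime
import io

# Hypothetical sample data; the quoting of the date column is an assumption
# about how the real CSV files are written.
SAMPLE_CSV = (
    '0,"(2011, 12, 11, 15, 45, 20)",'
    'Arduino/libraries/dallas-temperature-control/\n'
)


def parse_rows(fileobj):
    """Yield CSV rows with column 1 converted from a tuple string to a datetime."""
    for row in csv.reader(fileobj):
        # literal_eval only accepts Python literals, so it is safe on untrusted text.
        row[1] = datetime.datetime(*ast.literal_eval(row[1]))
        yield row


rows = list(parse_rows(io.StringIO(SAMPLE_CSV)))
print(rows[0][1])  # 2011-12-11 15:45:20
```

For real files, io.StringIO(SAMPLE_CSV) would be replaced by an open file object, and the generator keeps memory use flat even for tens of thousands of rows.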

jonrsharpe answered Feb 13 '23 20:02


As @jonrsharpe said in his answer, you can use ast.literal_eval to convert the string to a tuple and then unpack it into the datetime constructor.

But based on the following tests, datetime.datetime.strptime() still appears to be the faster method.

Code:

import datetime
import ast

def func1(datestring):
    return datetime.datetime(*ast.literal_eval(datestring))

def func2(datestring):
    return datetime.datetime.strptime(datestring, '(%Y, %m, %d, %H, %M, %S)')

Timing information:

In [39]: %timeit func1("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 30.1 µs per loop

In [40]: %timeit func2("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 26.9 µs per loop

In [41]: %timeit func1("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 38.6 µs per loop

In [42]: %timeit func2("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 28.8 µs per loop

In [43]: %timeit func1("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 31.2 µs per loop

In [44]: %timeit func2("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 29.5 µs per loop

In [45]: %timeit func1("(2011, 12, 11, 15, 45, 20)")
The slowest run took 5.51 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 32.6 µs per loop

In [46]: %timeit func2("(2011, 12, 11, 15, 45, 20)")
The slowest run took 15.42 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 27.5 µs per loop

In [47]: %timeit func1("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 49.2 µs per loop

In [48]: %timeit func2("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 24.4 µs per loop

I'm not sure why you consider datetime.datetime.strptime() counter-intuitive, but for parsing strings into datetime objects I would say you should use strptime().
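Since timings like the ones above vary between machines and Python versions, the comparison can be re-run outside IPython with the standard timeit module. The sketch below reuses func1 and func2 as defined above; func3 (plain string splitting) and the benchmark helper are hypothetical additions that were not part of the original tests, included only so the reader can measure a third option for themselves.

```python
import ast
import datetime
import timeit

SAMPLE = "(2011, 12, 11, 15, 45, 20)"


def func1(datestring):
    return datetime.datetime(*ast.literal_eval(datestring))


def func2(datestring):
    return datetime.datetime.strptime(datestring, '(%Y, %m, %d, %H, %M, %S)')


def func3(datestring):
    # Hypothetical third option, not from the tests above: strip the
    # parentheses, split on commas, and let int() ignore the spaces.
    return datetime.datetime(*map(int, datestring.strip('()').split(',')))


# All three approaches must produce the same datetime before timing them.
assert func1(SAMPLE) == func2(SAMPLE) == func3(SAMPLE)


def benchmark(number=10000, repeat=3):
    """Return the best time per call in microseconds for each function."""
    results = {}
    for func in (func1, func2, func3):
        best = min(timeit.repeat(lambda: func(SAMPLE), number=number, repeat=repeat))
        results[func.__name__] = best / number * 1e6
    return results


if __name__ == "__main__":
    for name, usec in benchmark().items():
        print(f"{name}: {usec:.1f} µs per loop")
```

No expected numbers are given here on purpose; which function wins should be checked on the target machine with realistic data.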

Anand S Kumar answered Feb 13 '23 22:02