A faster strptime?

I have code which reads vast numbers of dates in 'YYYY-MM-DD' format. Parsing each date so that the code can add one, two, or three days and then write it back in the same format is slowing things down considerably.

 3214657   14.330    0.000  103.698    0.000 trade.py:56(effective)
 3218418   34.757    0.000   66.155    0.000 _strptime.py:295(_strptime)

 day = datetime.datetime.strptime(endofdaydate, "%Y-%m-%d").date()

Any suggestions how to speed it up a bit (or a lot)?

John Mee asked Nov 20 '12 07:11


2 Answers

Is a factor of 7 "a lot" enough?

datetime.datetime.strptime(a, '%Y-%m-%d').date()       # 8.87us

datetime.date(*map(int, a.split('-')))                 # 1.28us

EDIT: great idea with explicit slicing:

datetime.date(int(a[:4]), int(a[5:7]), int(a[8:10]))   # 1.06us

That makes it a factor of 8.
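For the full round trip described in the question (parse, add a few days, write back in the same format), a minimal sketch could look like the following. The helper name `add_days` is made up for illustration, and the sketch assumes every input is a well-formed 'YYYY-MM-DD' string, since slicing does no format validation beyond what `int()` and `datetime.date` reject.

```python
import datetime


def add_days(iso_date, days):
    """Shift a 'YYYY-MM-DD' string by `days` and return it in the same format.

    `add_days` is a hypothetical helper, not part of the original code.
    Assumes the input is exactly 'YYYY-MM-DD'.
    """
    # Slice out year, month, and day instead of calling strptime.
    day = datetime.date(int(iso_date[:4]), int(iso_date[5:7]), int(iso_date[8:10]))
    # date.isoformat() produces 'YYYY-MM-DD', so no strftime call is needed.
    return (day + datetime.timedelta(days=days)).isoformat()


print(add_days("2012-11-20", 3))  # 2012-11-23
print(add_days("2012-12-30", 3))  # 2013-01-02
```

Using `isoformat()` on the way out avoids a `strftime` call as well, which may matter when millions of dates are written back.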

eumiro answered Sep 30 '22 09:09


Python 3.7+: fromisoformat()

Since Python 3.7, the datetime class has a fromisoformat method. It parses ISO-formatted strings such as 'YYYY-MM-DD', so it applies directly to this question.
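Because the question ends with `.date()`, the `date` class's own `fromisoformat` (also available since Python 3.7) may be the closest fit. Here is a minimal sketch of the parse, shift, and format round trip; the function name `shift_iso_date` is invented for illustration and is not taken from the original code.

```python
from datetime import date, timedelta


def shift_iso_date(endofdaydate, offset_days):
    """Parse 'YYYY-MM-DD', add `offset_days`, and return 'YYYY-MM-DD'.

    `shift_iso_date` is a hypothetical name. Requires Python 3.7+.
    """
    # date.fromisoformat returns a date directly, so no .date() call is needed.
    day = date.fromisoformat(endofdaydate)
    return (day + timedelta(days=offset_days)).isoformat()


print(shift_iso_date("2012-11-20", 1))  # 2012-11-21
print(shift_iso_date("2012-11-30", 2))  # 2012-12-02
```

Unlike plain slicing, `fromisoformat` raises `ValueError` on malformed input, so bad rows are not silently misparsed.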

Performance vs. strptime()

Explicit string slicing is about 9x faster than plain strptime in the timings below, but the built-in fromisoformat method is about 90x faster:

%timeit isofmt(datelist)
569 µs ± 8.45 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit slice2int(datelist)
5.51 ms ± 48.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit normalstrptime(datelist)
52.1 ms ± 1.27 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

from datetime import datetime, timedelta
base, n = datetime(2000, 1, 1, 1, 2, 3, 420001), 10000
datelist = [(base + timedelta(days=i)).strftime('%Y-%m-%d') for i in range(n)]

def isofmt(l):
    return list(map(datetime.fromisoformat, l))
    
def slice2int(l):   
    def slicer(t):
        return datetime(int(t[:4]), int(t[5:7]), int(t[8:10]))
    return list(map(slicer, l))

def normalstrptime(l):
    return [datetime.strptime(t, '%Y-%m-%d') for t in l]
    
print(isofmt(datelist[0:1]))
print(slice2int(datelist[0:1]))
print(normalstrptime(datelist[0:1]))

# [datetime.datetime(2000, 1, 1, 0, 0)]
# [datetime.datetime(2000, 1, 1, 0, 0)]
# [datetime.datetime(2000, 1, 1, 0, 0)]

Timings measured with Python 3.8.3rc1 x64 / Win10.
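The `%timeit` lines above only work inside IPython or Jupyter. To reproduce the comparison from a plain script, something like the sketch below should do, using the standard `timeit` module. The list size and repeat counts are arbitrary choices for this sketch, and absolute numbers will differ by machine and Python version.

```python
import timeit
from datetime import datetime, timedelta

# Smaller list than in the answer above; the size is an arbitrary choice.
base, n = datetime(2000, 1, 1), 1000
datelist = [(base + timedelta(days=i)).strftime('%Y-%m-%d') for i in range(n)]

# The three parsing strategies compared in this answer.
candidates = {
    "fromisoformat": lambda: [datetime.fromisoformat(t) for t in datelist],
    "slicing": lambda: [datetime(int(t[:4]), int(t[5:7]), int(t[8:10])) for t in datelist],
    "strptime": lambda: [datetime.strptime(t, '%Y-%m-%d') for t in datelist],
}

# Sanity check: all strategies must produce identical results.
results = {name: fn() for name, fn in candidates.items()}
assert results["fromisoformat"] == results["slicing"] == results["strptime"]

for name, fn in candidates.items():
    # Best of 3 repeats, each repeat making 10 passes over the list.
    best = min(timeit.repeat(fn, number=10, repeat=3))
    print(f"{name:>14}: {best / 10 * 1e3:.3f} ms per pass over {n} dates")
```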

FObersteiner answered Sep 30 '22 10:09