Python datetime.strptime() Eating lots of CPU Time

I have some log-parsing code that needs to turn a timestamp into a datetime object. I am using datetime.strptime, but according to the cumtime column in cProfile's output this function accounts for a lot of CPU time. The timestamps are in the format 01/Nov/2010:07:49:33.

The current function is:

new_entry['time'] = datetime.strptime(
    parsed_line['day'] +
    parsed_line['month'] +
    parsed_line['year'] +
    parsed_line['hour'] +
    parsed_line['minute'] +
    parsed_line['second'],
    "%d%b%Y%H%M%S"
)

Does anyone know how I might optimize this?

Kyle Brandt asked Nov 01 '10 16:11



1 Answer

If the timestamps have a fixed-width format, there is no need to parse the line; you can use slicing and a dictionary lookup to get the fields directly.

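import datetime

# Map each three-letter month abbreviation to its month number.
month_abbreviations = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4,
                       'May': 5, 'Jun': 6, 'Jul': 7, 'Aug': 8,
                       'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
# line holds the fixed-width timestamp, e.g. 01/Nov/2010:07:49:33
year = int(line[7:11])
month = month_abbreviations[line[3:6]]
day = int(line[0:2])
hour = int(line[12:14])
minute = int(line[15:17])
second = int(line[18:20])
# With "from datetime import datetime" instead, call datetime(...) directly.
new_entry['time'] = datetime.datetime(year, month, day, hour, minute, second)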

Testing in the manner shown by Glenn Maynard shows this to be about 3 times faster.
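The other answer's timing code is not reproduced here, so the following is only a rough sketch of how such a comparison could be run, assuming the standard timeit module. The function names and the time_parsers helper are made up for illustration. The ratio you measure will depend on your machine and Python version.

```python
import timeit
from datetime import datetime

# Map each three-letter month abbreviation to its month number.
MONTH_ABBREVIATIONS = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4,
                       'May': 5, 'Jun': 6, 'Jul': 7, 'Aug': 8,
                       'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}

SAMPLE = "01/Nov/2010:07:49:33"


def parse_with_strptime(stamp):
    """Baseline: let strptime interpret the format string on every call."""
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S")


def parse_with_slicing(stamp):
    """Fixed-width alternative: slice the fields and look up the month."""
    return datetime(int(stamp[7:11]),
                    MONTH_ABBREVIATIONS[stamp[3:6]],
                    int(stamp[0:2]),
                    int(stamp[12:14]),
                    int(stamp[15:17]),
                    int(stamp[18:20]))


def time_parsers(number=100000):
    """Return the total seconds each parser takes for `number` calls."""
    return {
        func.__name__: timeit.timeit(lambda: func(SAMPLE), number=number)
        for func in (parse_with_strptime, parse_with_slicing)
    }


if __name__ == "__main__":
    # Check that both approaches agree before comparing their speed.
    assert parse_with_strptime(SAMPLE) == parse_with_slicing(SAMPLE)
    for name, seconds in time_parsers().items():
        print(f"{name}: {seconds:.3f} s")
```

Note that the slicing version raises KeyError or ValueError on a line that does not match the fixed-width layout, so it only suits logs whose timestamp format is known to be consistent.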

Mark Ransom answered Oct 27 '22 00:10
