
Parse dates with any separator using Python's strptime

I'm parsing a date using Python's datetime.strptime. The dates are in %Y/%m/%d format, but I want to be agnostic about the separator used for the dates - /, -, . and any others are all fine.

Right now, I'm using a try block, but it's a bit verbose.

Is there a simple way to accept any given separator for strptime?

Brad Koch asked Jan 03 '14 22:01




2 Answers

You'd have to use a regular expression to normalize the separators:

import re

datetimestring = re.sub('[-.:]', '/', datetimestring)

This replaces any -, . or : character with a slash.
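
As a minimal sketch, the normalization could feed straight into strptime like this. The helper name parse_ymd is hypothetical, and the character class is the one from the snippet above, so any other separators would have to be added to it:

```python
import re
from datetime import datetime

def parse_ymd(datetimestring):
    """Parse a year/month/day string whose separators may be -, ., : or /."""
    # Rewrite every accepted separator to a slash so one format string suffices.
    normalized = re.sub(r'[-.:]', '/', datetimestring)
    return datetime.strptime(normalized, '%Y/%m/%d')

print(parse_ymd('2014-01-03'))
print(parse_ymd('2000.1.1'))
```

Because each separator is replaced independently, this also accepts mixed separators such as 1991/02-01.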

Alternatively, use the dateutil.parser.parse() function to handle arbitrary date-time formats; it is extremely flexible towards separators.
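
A short sketch of the dateutil route, assuming the third-party python-dateutil package is installed. The two sample strings are only illustrations:

```python
from dateutil import parser  # third-party: pip install python-dateutil

# parse() infers the layout itself, so no format string or separator is given.
for text in ('2014/01/03', '1994-8-22'):
    print(text, '=>', parser.parse(text).date())
```

Since parse() guesses the field order, ambiguous input may not come out as %Y/%m/%d would read it. If that matters, the dayfirst and yearfirst arguments or the stricter strptime approach may be a better fit.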

Martijn Pieters answered Sep 20 '22 01:09


You can use named groups and a named back-reference:

import re, datetime

txt='''\
1994-8-22
1970 1 1
2014/01/03
2000.1.1
1999 8 27
1991/02-01   bad date'''

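# Year, one non-digit separator, month, the same separator again, then day.
for test in txt.splitlines():
    dt=re.match(r'\s*(?P<Y>\d\d\d\d)(?P<sep>\D)(?P<m>\d\d?)(?P=sep)(?P<d>\d\d?)', test)
    if dt:
        # !s converts the tuple to str so the width spec also works on Python 3.
        print('"{:10}"=>{!s:20}=>{}'.format(test,
                                        dt.group(*'Ymd'),
                                        datetime.date(*map(int, dt.group(*'Ymd')))))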

This lets the separator be any non-digit character while requiring it to be the same in both positions. For example, 1999/1/2 is accepted, but 1999-1/2, which mixes separators, is rejected.

Prints:

"1994-8-22 "=>('1994', '8', '22') =>1994-08-22
"1970 1 1  "=>('1970', '1', '1')  =>1970-01-01
"2014/01/03"=>('2014', '01', '03')=>2014-01-03
"2000.1.1  "=>('2000', '1', '1')  =>2000-01-01
"1999 8 27 "=>('1999', '8', '27') =>1999-08-27
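
The same back-reference idea can be wrapped in a reusable function. This is only a sketch: the names DATE_RE and parse_consistent are hypothetical, and unlike the loop above it uses fullmatch, so trailing text after the date is rejected as well:

```python
import datetime
import re

# (?P=sep) must repeat exactly what (?P<sep>\D) matched the first time.
DATE_RE = re.compile(
    r'\s*(?P<Y>\d{4})(?P<sep>\D)(?P<m>\d\d?)(?P=sep)(?P<d>\d\d?)\s*')

def parse_consistent(text):
    """Return a datetime.date, or None if text is not a consistently separated date."""
    match = DATE_RE.fullmatch(text)
    if match is None:
        return None
    year, month, day = map(int, match.group('Y', 'm', 'd'))
    try:
        return datetime.date(year, month, day)
    except ValueError:
        # Well-formed text but an impossible calendar date, e.g. month 13.
        return None

print(parse_consistent('1970 1 1'))
print(parse_consistent('1991/02-01'))
```

Returning None instead of raising keeps the caller free of try blocks, which is one possible answer to the verbosity mentioned in the question.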

dawg answered Sep 21 '22 01:09