>>> import dateutil.parser, dateutil.tz as tz
>>> dateutil.parser.parse('2017-08-09 10:45 am').replace(tzinfo=tz.gettz('America/New_York'))
datetime.datetime(2017, 8, 9, 10, 45, tzinfo=tzfile('/usr/share/zoneinfo/America/New_York'))
Is that really the way that we're supposed to set a default timezone for parsing? I've read the documentation for the parser and examples but I cannot seem to find anything that says, "This is how to set the default timezone for dateutil.parser.parse", or even anything like it.
Because while this works, it would do the wrong thing in cases where the string itself provides a zone, since the parsed zone would be overwritten. Does that mean we should do this?
>>> d = dateutil.parser.parse('2017-08-09 10:45 am +06:00')
>>> d = d.replace(tzinfo=d.tzinfo or tz.gettz('America/Chicago'))
Because that's clunky, too.
What's the recommended way to set a default timezone when parsing?
There are basically two "correct" ways to do this. You can see that this was brought up as Issue #94 on dateutil's issue tracker, and "set a default time zone" was determined to be out of scope, since it can easily be done with the information the parser already returns (so there is no need to build it into the parser itself). The two ways are:
Provide a default date that has a time zone. If you don't care what the default date is, you can just specify some date literal and be done with it. If you want the behavior to be basically the same as dateutil's default behavior (replacing missing elements from "today's date at midnight"), you have to have a bit of boilerplate:
from datetime import datetime, time
from dateutil import tz, parser
default_date = datetime.combine(datetime.now(),
                                time(0, tzinfo=tz.gettz("America/New_York")))
dt = parser.parse(some_dt_str, default=default_date)
Use your second method with .replace:
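from dateutil import parser, tz  # tz is needed for the default argument below
def my_parser(*args, default_tzinfo=tz.gettz("America/New_York"), **kwargs):
    dt = parser.parse(*args, **kwargs)
    # Keep a parsed time zone if there is one; otherwise attach the default
    return dt.replace(tzinfo=dt.tzinfo or default_tzinfo)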
This last one is probably slightly cleaner than the first, but it is slightly slower if run in a tight loop, since the first approach only needs the default date created once. That said, dateutil's parser is actually quite slow, so an extra date construction is likely the least of your problems if you're running it in a tight loop.
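To make the difference between the two approaches concrete, here is a minimal sketch that runs both on a string without a zone and on one with an explicit offset. The helper names (parse_with_default_date, parse_then_replace) are made up for illustration, and taking "today" in the target zone rather than in local time is an assumption of this sketch, not something the parser requires.

```python
from datetime import datetime, time, timedelta
from dateutil import parser, tz

NEW_YORK = tz.gettz("America/New_York")

# Approach 1: a timezone-aware default, built once and reused.
# Using "today" in the target zone is an assumption of this sketch.
DEFAULT_DATE = datetime.combine(datetime.now(NEW_YORK), time(0, tzinfo=NEW_YORK))

def parse_with_default_date(text):
    """Hypothetical helper: let the parser fill gaps from an aware default."""
    return parser.parse(text, default=DEFAULT_DATE)

# Approach 2: parse first, then attach the zone only if none was parsed.
def parse_then_replace(text, default_tzinfo=NEW_YORK):
    """Hypothetical helper: attach default_tzinfo only to naive results."""
    dt = parser.parse(text)
    return dt if dt.tzinfo is not None else dt.replace(tzinfo=default_tzinfo)

for text in ("2017-08-09 10:45 am", "2017-08-09 10:45 am +06:00"):
    a = parse_with_default_date(text)
    b = parse_then_replace(text)
    print(text, "->", a.isoformat(), "|", b.isoformat())
```

Both helpers should attach New York time to the first string and leave the explicit +06:00 offset of the second string untouched, which is exactly the case the plain .replace(tzinfo=...) call in the question gets wrong.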
Fleshing out Paul's comment - because a datetime has to have at least a year, month, and day, dateutil already has a default that it uses:
>>> from datetime import datetime
>>> datetime.now()
datetime.datetime(2017, 10, 13, 15, 16, 13, 548750)
>>> dateutil.parser.parse('2017')
datetime.datetime(2017, 10, 13, 0, 0)
Given this, the appropriate choice would be to create a default that contains the timezone and is either just the current date, or whatever date makes sense:
>>> dateutil.parser.parse('2017', default=datetime(2017, 10, 13, tzinfo=tz.gettz('America/New_York')))
Naturally you can store the default under a sensible name, like default_datetime, and then it becomes:
>>> dateutil.parser.parse('2017', default=default_datetime)
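As a rough illustration of how the default fills in whatever the string leaves out, the sketch below parses a few partial strings against a fixed default_datetime. The fixed date (taken from the example above) and the sample strings are only assumptions chosen to keep the output reproducible.

```python
from datetime import datetime, timedelta
from dateutil import parser, tz

# A fixed, timezone-aware default; any date that makes sense would do.
default_datetime = datetime(2017, 10, 13, tzinfo=tz.gettz("America/New_York"))

# Each string supplies only some fields; the rest come from the default.
year_only = parser.parse("2017", default=default_datetime)
time_only = parser.parse("10:45 am", default=default_datetime)
month_day = parser.parse("May 5", default=default_datetime)

for label, dt in (("year only", year_only),
                  ("time only", time_only),
                  ("month and day", month_day)):
    print(f"{label:>13}: {dt.isoformat()}")
```

In each case the missing fields, including the time zone, are taken from default_datetime, so none of the results is a naive datetime.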