Parsing a date in python without using a default

I'm using Python's dateutil.parser tool to parse some dates I'm getting from a third-party feed. It allows specifying a default date, which itself defaults to today, for filling in missing elements of the parsed date. While this is helpful in general, there is no sane default for my use case, and I would prefer to treat partial dates as if I had not gotten a date at all (since it almost always means I got garbled data). I've written the following workaround:

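from dateutil import parser
import datetime

def parse_no_default(dt_str):
  # Parse twice with defaults that differ in year, month and day; if the
  # results agree, no date field was taken from the default.
  dt = parser.parse(dt_str, default=datetime.datetime(1900, 1, 1)).date()
  dt2 = parser.parse(dt_str, default=datetime.datetime(1901, 2, 2)).date()
  if dt == dt2:
    return dt
  else:
    return None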

(This snippet only looks at the date, as that's all I care about for my application, but similar logic could be extended to include the time component.)

I'm wondering (hoping) whether there's a better way of doing this. Parsing the same string twice just to see if it fills in different defaults seems like a gross waste of resources, to say the least.

Here's the set of tests (using nosetest generators) for the expected behavior:

import datetime
import nose.tools
import lib.tools.date

def check_parse_no_default(sample, expected):
  actual = lib.tools.date.parse_no_default(sample)
  nose.tools.eq_(actual, expected)

def test_parse_no_default():
  cases = ( 
      ('2011-10-12', datetime.date(2011, 10, 12)),
      ('2011-10', None),
      ('2011', None),
      ('10-12', None),
      ('2011-10-12T11:45:30', datetime.date(2011, 10, 12)),
      ('10-12 11:45', None),
      ('', None),
  )
  for sample, expected in cases:
    yield check_parse_no_default, sample, expected
Mark Tozzi asked Dec 08 '11 17:12

2 Answers

Depending on your domain, the following solution might work:

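import datetime
from dateutil import parser

DEFAULT_DATE = datetime.datetime(datetime.MINYEAR, 1, 1)

def parse_no_default(dt_str):
    dt = parser.parse(dt_str, default=DEFAULT_DATE).date()
    # Compare date with date; a date never equals a datetime.
    if dt != DEFAULT_DATE.date():
        return dt
    return None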
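If the feed's formats are known in advance, a different technique that sidesteps dateutil's defaults entirely is to try explicit `strptime` formats from the standard library. This is only a sketch: `KNOWN_FORMATS` and `parse_strict` are hypothetical names, and the two formats listed are an assumption based on the sample strings in the question, not something the feed is known to guarantee.

```python
import datetime

# Hypothetical list of formats the feed is assumed to use; adjust as needed.
KNOWN_FORMATS = ("%Y-%m-%d", "%Y-%m-%dT%H:%M:%S")

def parse_strict(dt_str):
    """Return a date only if dt_str fully matches one of KNOWN_FORMATS."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.datetime.strptime(dt_str, fmt).date()
        except ValueError:
            continue  # wrong format or partial date; try the next one
    return None
```

Because `strptime` requires the whole string to match the format, partial inputs such as '2011-10' are rejected rather than completed. The trade-off is that any format missing from the list is rejected as well.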

Another approach would be to monkey-patch the parser class (this is very hackish, so I wouldn't recommend it if you have other options):

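import dateutil.parser as parser

# Replace parser.parse with a version that returns the raw result of the
# private _parse method instead of merging it with a default datetime.
def parse(self, timestr, default=None,
          ignoretz=False, tzinfos=None,
          **kwargs):
    return self._parse(timestr, **kwargs)

parser.parser.parse = parse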

You can use it as follows:

>>> ddd = parser.parser().parse('2011-01-02', None)
>>> ddd
_result(year=2011, month=01, day=02)
>>> ddd = parser.parser().parse('2011', None)
>>> ddd

By checking which members are available in the result (ddd), you can determine when to return None. When all the fields you need are available, you can convert ddd into a datetime object:

# ddd might have following fields:
# "year", "month", "day", "weekday",
# "hour", "minute", "second", "microsecond",
# "tzname", "tzoffset"
datetime.datetime(ddd.year, ddd.month, ddd.day)
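The field check described above could look roughly like the following. It assumes that fields the parser did not find are exposed as None on the result object. `result_to_date` and `REQUIRED_FIELDS` are hypothetical names, and a `SimpleNamespace` stands in for the parser's result so the sketch runs without touching dateutil's private API.

```python
import datetime
from types import SimpleNamespace

REQUIRED_FIELDS = ("year", "month", "day")

def result_to_date(res):
    """Return a date if res carries year, month and day; otherwise None."""
    values = [getattr(res, name, None) for name in REQUIRED_FIELDS]
    if any(v is None for v in values):
        return None
    return datetime.date(*values)

# SimpleNamespace stands in for the parser's result object here.
full = SimpleNamespace(year=2011, month=1, day=2)
partial = SimpleNamespace(year=2011, month=None, day=None)
print(result_to_date(full))     # 2011-01-02
print(result_to_date(partial))  # None
```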
ILYA Khlopotov answered Oct 21 '22 03:10

This is probably a "hack", but it looks like dateutil looks at very few attributes out of the default you pass in. You could provide a 'fake' datetime that explodes in the desired way.

>>> import datetime
>>> import dateutil.parser
>>> class NoDefaultDate(object):
...     def replace(self, **fields):
...         if any(f not in fields for f in ('year', 'month', 'day')):
...             return None
...         return datetime.datetime(2000, 1, 1).replace(**fields)
...
>>> def wrap_parse(v):
...     _actual = dateutil.parser.parse(v, default=NoDefaultDate())
...     return _actual.date() if _actual is not None else None
...
>>> cases = (
...   ('2011-10-12', datetime.date(2011, 10, 12)),
...   ('2011-10', None),
...   ('2011', None),
...   ('10-12', None),
...   ('2011-10-12T11:45:30', datetime.date(2011, 10, 12)),
...   ('10-12 11:45', None),
...   ('', None),
...   )
>>> all(wrap_parse(test) == expected for test, expected in cases)
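To see what the fake default does on its own, here is the same class exercised directly, without dateutil. The keyword arguments passed to `replace` imitate the fields a parser might hand over; that calling pattern is an assumption about dateutil's internals. Other dateutil versions may read further attributes of the default, in which case this fake would raise an AttributeError instead.

```python
import datetime

class NoDefaultDate(object):
    """Fake default whose replace() refuses to fill in missing date fields."""
    def replace(self, **fields):
        if any(f not in fields for f in ("year", "month", "day")):
            return None
        return datetime.datetime(2000, 1, 1).replace(**fields)

fake = NoDefaultDate()
# All three date fields supplied: a real datetime comes back.
complete = fake.replace(year=2011, month=10, day=12, hour=11, minute=45)
# Year missing: the fake refuses to fall back on a default.
incomplete = fake.replace(month=10, day=12)
print(complete)    # 2011-10-12 11:45:00
print(incomplete)  # None
```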
SingleNegationElimination answered Oct 21 '22 03:10
