Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I find missing dates in a list of sorted dates?

Tags:

In Python how do I find all the missing days in a sorted list of dates?

like image 481
Vishal Avatar asked Feb 22 '10 23:02

Vishal


2 Answers

using sets

>>> from datetime import date, timedelta
>>> d = [date(2010, 2, 23), date(2010, 2, 24), date(2010, 2, 25),
         date(2010, 2, 26), date(2010, 3, 1), date(2010, 3, 2)]
>>> date_set = set(d[0] + timedelta(x) for x in range((d[-1] - d[0]).days))
>>> missing = sorted(date_set - set(d))
>>> missing
[datetime.date(2010, 2, 27), datetime.date(2010, 2, 28)]
>>> 
like image 182
John La Rooy Avatar answered Oct 07 '22 19:10

John La Rooy


Sort the list of dates and iterate over it, remembering the previous entry. If the difference between the previous and current entry is more than one day, you have missing days.

Here's one way to implement it:

from datetime import date, timedelta
from itertools import tee, izip

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    b.next()
    return izip(a, b)

def missing_dates(dates):
    for prev, curr in pairwise(sorted(dates)):
        i = prev
        while i + timedelta(1) < curr:
            i += timedelta(1)
            yield i

dates = [ date(2010, 1, 8),
          date(2010, 1, 2),
          date(2010, 1, 5),
          date(2010, 1, 1),
          date(2010, 1, 7) ]

for missing in missing_dates(dates):
    print missing

Output:

2010-01-03
2010-01-04
2010-01-06

Performance is O(n*log(n)) where n is the number of days in the span when the input is unsorted. As your list is already sorted, it will run in O(n).

like image 37
Mark Byers Avatar answered Oct 07 '22 18:10

Mark Byers