In Python how do I find all the missing days in a sorted list of dates?
using sets
>>> from datetime import date, timedelta
>>> d = [date(2010, 2, 23), date(2010, 2, 24), date(2010, 2, 25),
date(2010, 2, 26), date(2010, 3, 1), date(2010, 3, 2)]
>>> date_set = set(d[0] + timedelta(x) for x in range((d[-1] - d[0]).days))
>>> missing = sorted(date_set - set(d))
>>> missing
[datetime.date(2010, 2, 27), datetime.date(2010, 2, 28)]
>>>
Sort the list of dates and iterate over it, remembering the previous entry. If the difference between the previous and current entry is more than one day, you have missing days.
Here's one way to implement it:
from datetime import date, timedelta
from itertools import tee, izip
def pairwise(iterable):
"s -> (s0,s1), (s1,s2), (s2, s3), ..."
a, b = tee(iterable)
b.next()
return izip(a, b)
def missing_dates(dates):
for prev, curr in pairwise(sorted(dates)):
i = prev
while i + timedelta(1) < curr:
i += timedelta(1)
yield i
dates = [ date(2010, 1, 8),
date(2010, 1, 2),
date(2010, 1, 5),
date(2010, 1, 1),
date(2010, 1, 7) ]
for missing in missing_dates(dates):
print missing
Output:
2010-01-03
2010-01-04
2010-01-06
Performance is O(n*log(n)) where n is the number of days in the span when the input is unsorted. As your list is already sorted, it will run in O(n).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With