I have list of start dates, and a list of end dates. They're sorted...
start_dates = [
    datetime.date(2009, 11, 5), datetime.date(2009, 11, 13), 
    datetime.date(2009, 11, 25), datetime.date(2009, 11, 26), 
    datetime.date(2009, 12, 4), datetime.date(2009, 12, 7), 
    datetime.date(2009, 12, 29), datetime.date(2009, 12, 30)]
end_dates = [
    datetime.date(2009, 10, 1), datetime.date(2009, 10, 2), 
    datetime.date(2009, 10, 9), datetime.date(2009, 10, 12), 
    datetime.date(2009, 11, 4), datetime.date(2009, 12, 14), 
    datetime.date(2009, 12, 15)]
The start dates represent dates on which we received advice to buy a share. The end dates are dates on which which we received advice to sell it. The sources of advice are different and we are backtesting what would happen if we used the buy advice from one source, but the sell advice from another. Thus we have two sequences of dates which we want to resolve into pairs - or intervals - over which we would have held a position in the stock.
Thus we'll take a date from the start_dates to decide when to buy the stock: on Nov 5 we buy a position. Then we cycle through the end_dates looking for the first advice to sell it: Dec 14. And repeat, buying when we don't hold a position when advised to from one source, and selling when we do hold a position from the other source.
You might say we want to flap between which of two lists we're looping over.
Thus the input above generates:
result = (
  (datetime.date(2009, 11, 5), datetime.date(2009, 12, 14)),
  (datetime.date(2009, 12, 29), None)
)
I'm using for-loops inside for-loops and wonder if there is not a better way. Performance is of interest since it will be applied to thousands of scenarios over a 40 year span; some of the lists involve thousands of dates.
I've finally nailed it down to:
    trades = []
    enddate = datetime.date(1900, 1, 1)
    try:
        for startdate in startdates:
            if enddate <= startdate:
                enddate = enddates.next()
                while enddate <= startdate:
                    enddate = enddates.next()
                trades.append((startdate, enddate))
    except StopIteration:
        trades.append((startdate, None))
Thanks to those who asked questions and answered. For no rational reason this little puzzle became a fixation for me but I think finally I have done this to death and should move on with my life. It's really very simple in the end - astonishing how much work it took to make it this plain!
Edit
This should scale with len(start_dates)+len(end_dates):
def date_range(start_dates, end_dates):
    result = []
    start_i = 0
    end_i = 0
    while start_i<len(start_dates):
        while end_i<len(end_dates) and start_dates[start_i]>end_dates[end_i]:
            end_i += 1
        if end_i == len(end_dates):
            result.append((start_dates[start_i], None))
            break
        result.append((start_dates[start_i], end_dates[end_i]))
        while start_i<len(start_dates) and start_dates[start_i]<=end_dates[end_i]:
            start_i += 1
        end_i += 1
    return result
Usage:
In  : start_dates = [
   ....:     datetime.date(2009, 11, 5), datetime.date(2009, 11, 13),
   ....:     datetime.date(2009, 11, 25), datetime.date(2009, 11, 26),
   ....:     datetime.date(2009, 12, 4), datetime.date(2009, 12, 7),
   ....:     datetime.date(2009, 12, 29), datetime.date(2009, 12, 30)]
In : end_dates = [
   ....:     datetime.date(2009, 10, 1), datetime.date(2009, 10, 2),
   ....:     datetime.date(2009, 10, 9), datetime.date(2009, 10, 12),
   ....:     datetime.date(2009, 11, 4), datetime.date(2009, 12, 14),
   ....:     datetime.date(2009, 12, 15)]
In : date_range(start_dates, end_dates)
Out:
[(datetime.date(2009, 11, 5), datetime.date(2009, 12, 14)),
 (datetime.date(2009, 12, 29), None)]
In : start_dates = [
   ....:     datetime.date(2009, 11, 5), datetime.date(2009, 11, 13),
   ....:     datetime.date(2009, 11, 25), datetime.date(2009, 11, 26),
   ....:     datetime.date(2009, 12, 4), datetime.date(2009, 12, 7),
   ....:     datetime.date(2009, 12, 29), datetime.date(2009, 12, 30)]
In : end_dates = [
   ....:     datetime.date(2009, 10, 1), datetime.date(2009, 10, 2),
   ....:     datetime.date(2009, 10, 9), datetime.date(2009, 10, 12),
   ....:     datetime.date(2009, 11, 7), datetime.date(2009, 12, 14), # changed (2009, 11, 4) -> (2009, 11, 7)
   ....:     datetime.date(2009, 12, 15)]
In : date_range(start_dates, end_dates)
Out:
[(datetime.date(2009, 11, 5), datetime.date(2009, 11, 7)),
 (datetime.date(2009, 11, 13), datetime.date(2009, 12, 14)),
 (datetime.date(2009, 12, 29), None)]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With