Given a date range how can we break it up into N contiguous sub-intervals?

I am accessing some data through an API where I need to provide the date range for my request, e.g. start='20100101', end='20150415'. I thought I would speed this up by breaking the date range into non-overlapping intervals and using multiprocessing on each interval.

My problem is that the way I break up the date range does not consistently give me the expected result. Here is what I have done:

    from datetime import date

    begin = '20100101'
    end = '20101231'

Suppose we wanted to break this up into quarters. First I convert the strings into dates:

    def get_yyyy_mm_dd(yyyymmdd):
        # given string 'yyyymmdd' return (yyyy, mm, dd)
        year = yyyymmdd[0:4]
        month = yyyymmdd[4:6]
        day = yyyymmdd[6:]
        return int(year), int(month), int(day)

    y1, m1, d1 = get_yyyy_mm_dd(begin)
    d1 = date(y1, m1, d1)
    y2, m2, d2 = get_yyyy_mm_dd(end)
    d2 = date(y2, m2, d2)

Then divide this range into sub-intervals:

    def remove_tack(dates_list):
        # given a list of dates in form YYYY-MM-DD return a list of strings in form 'YYYYMMDD'
        tackless = []
        for d in dates_list:
            s = str(d)
            tackless.append(s[0:4]+s[5:7]+s[8:])
        return tackless

    def divide_date(date1, date2, intervals):
        dates = [date1]
        for i in range(0, intervals):
            dates.append(dates[i] + (date2 - date1)/intervals)
        return remove_tack(dates)

Using begin and end from above we get:

    listdates = divide_date(d1, d2, 4)
    print(listdates)
    # ['20100101', '20100402', '20100702', '20101001', '20101231'] looks correct

But if instead I use the dates:

    begin = '20150101'
    end = '20150228'

...

    listdates = divide_date(d1, d2, 4)
    print(listdates)
    # ['20150101', '20150115', '20150129', '20150212', '20150226']

I am missing two days at the end of February. I don't need time or timezone for my application and I don't mind installing another library.

Scott asked Apr 18 '15 18:04


2 Answers

I would take a different approach and rely on timedelta and datetime addition to determine the non-overlapping ranges.

Implementation

    def date_range(start, end, intv):
        from datetime import datetime
        start = datetime.strptime(start, "%Y%m%d")
        end = datetime.strptime(end, "%Y%m%d")
        diff = (end - start) / intv
        for i in range(intv):
            yield (start + diff * i).strftime("%Y%m%d")
        yield end.strftime("%Y%m%d")

Execution

    >>> begin = '20150101'
    >>> end = '20150228'
    >>> list(date_range(begin, end, 4))
    ['20150101', '20150115', '20150130', '20150213', '20150228']
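
If each request needs its own start and end date, the same idea can be sketched with whole-day arithmetic so that consecutive intervals never share a day. This is only a sketch under assumptions: `split_range` is a hypothetical helper name, and it assumes the API treats both `start` and `end` as inclusive.

```python
from datetime import datetime, timedelta

def split_range(start, end, n):
    """Split an inclusive 'YYYYMMDD' range into n contiguous,
    non-overlapping (first_day, last_day) pairs.

    Hypothetical helper; assumes the API treats both ends as inclusive.
    """
    first = datetime.strptime(start, "%Y%m%d").date()
    last = datetime.strptime(end, "%Y%m%d").date()
    total_days = (last - first).days + 1  # inclusive day count
    if n < 1 or n > total_days:
        raise ValueError("n must be between 1 and the number of days in the range")
    pairs = []
    for i in range(n):
        # Integer division spreads any remainder days across the chunks.
        lo = first + timedelta(days=(total_days * i) // n)
        hi = first + timedelta(days=(total_days * (i + 1)) // n - 1)
        pairs.append((lo.strftime("%Y%m%d"), hi.strftime("%Y%m%d")))
    return pairs

print(split_range('20150101', '20150228', 4))
# [('20150101', '20150114'), ('20150115', '20150129'),
#  ('20150130', '20150213'), ('20150214', '20150228')]
```

Each pair could then be handed to a separate worker. If the API treats the end date as exclusive instead, the `- 1` in the `hi` offset would have to go.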
Abhijit answered Sep 27 '22 15:09


You should use datetime instead of date, so that the fractional part of each step is kept:

    from datetime import date, datetime, timedelta

    begin = '20150101'
    end = '20150228'

    def get_yyyy_mm_dd(yyyymmdd):
      # given string 'yyyymmdd' return (yyyy, mm, dd)
      year = yyyymmdd[0:4]
      month = yyyymmdd[4:6]
      day = yyyymmdd[6:]
      return int(year), int(month), int(day)

    y1, m1, d1 = get_yyyy_mm_dd(begin)
    d1 = datetime(y1, m1, d1)
    y2, m2, d2 = get_yyyy_mm_dd(end)
    d2 = datetime(y2, m2, d2)

    def remove_tack(dates_list):
      # given a list of dates in form YYYY-MM-DD return a list of strings in form 'YYYYMMDD'
      tackless = []
      for d in dates_list:
        s = str(d)
        tackless.append(s[0:4]+s[5:7]+s[8:])
      return tackless

    def divide_date(date1, date2, intervals):
      dates = [date1]
      # length of one sub-interval in seconds (divide by intervals, not a hardcoded 4)
      delta = (date2-date1).total_seconds()/intervals
      for i in range(0, intervals):
        dates.append(dates[i] + timedelta(seconds=delta))
      return remove_tack(dates)

    listdates = divide_date(d1, d2, 4)
    print(listdates)

result:

['20150101 00:00:00', '20150115 12:00:00', '20150130 00:00:00', '20150213 12:00:00', '20150228 00:00:00']
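
To see why the switch matters, here is a small sketch (not part of the code above). Adding a timedelta to a date ignores anything smaller than a day, so a 14.5-day step loses 12 hours on every iteration, while datetime keeps them. The time portion that shows up in the result above can be dropped with strftime("%Y%m%d").

```python
from datetime import date, datetime, timedelta

step = timedelta(days=14, hours=12)  # (Feb 28 - Jan 1) / 4 = 14.5 days

# date + timedelta ignores the sub-day part, so each step loses 12 hours.
d = date(2015, 1, 1)
for _ in range(4):
    d = d + step
print(d)  # 2015-02-26, two days short of the end date

# datetime keeps the 12 hours, so four steps land exactly on the end date.
dt = datetime(2015, 1, 1)
for _ in range(4):
    dt = dt + step
print(dt.strftime("%Y%m%d"))  # 20150228
```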

Jose Ricardo Bustos M. answered Sep 27 '22 16:09