Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas create date range at certain dates

I want to create a list (or array or whathever) of a given number of dates at monthly intervals.

Basically what I want is this

>>>some_function(start_date=date(2005, 5, 14), periods=4, freq='M')
['2005-05-14', '2005-06-14', '2005-07-14', '2005-08-14']

and if the day of the startmonth is close to end of the month I want this

>>>some_function(start_date=date(2007, 12, 31), periods=4, freq='M')
['2007-12-31', '2008-01-31', '2008-02-29', '2008-03-31']

I am aware of the pandas date_range function, however it produces this

pd.date_range(date(2005, 5, 14), periods=4, freq='M')
Out[1]: DatetimeIndex(['2005-05-31', '2005-06-30', '2005-07-31', '2005-08-31'],
          dtype='datetime64[ns]', freq='M')

i.e. it sets the month end as the day. Which is not what I want.

Obviously, this could be produced iterating over the number of periods, but this creates a hassle when the day of the startmonth is close to the last day of the month.

Does anybody know of a function producing this or is the method outlined above the only way?

like image 354
mortysporty Avatar asked Feb 02 '18 14:02

mortysporty


1 Answers

I think the behavior you're after is, you want a date range where all the dates are on the same day of the month as your start date, except use the last day of the month for months which have fewer days in the month.

You can do that by using pandas.DateOffset(months=1, day=day_of_month) as the freq argument, where day_of_month is the day of month you want each date to be on. This will automatically use the last day of the month for months whose last day is less than day_of_month.

In [68]: pandas.date_range('2005-05-14', periods=4, freq=pandas.DateOffset(months=1, day=14))
Out[68]: DatetimeIndex(['2005-05-14', '2005-06-14', '2005-07-14', '2005-08-14'], dtype='datetime64[ns]', freq='<DateOffset: day=14, months=1>')

In [69]: pandas.date_range('2007-12-31', periods=4, freq=pandas.DateOffset(months=1, day=31))
Out[69]: DatetimeIndex(['2007-12-31', '2008-01-31', '2008-02-29', '2008-03-31'], dtype='datetime64[ns]', freq='<DateOffset: day=31, months=1>')
like image 66
user3311658 Avatar answered Sep 17 '22 11:09

user3311658