Is there any way to create a series of equally spaced date-time objects, given the start/stop epochs and the desired number of intervening elements?
import dateutil.parser
import pandas

t0 = dateutil.parser.parse("23-FEB-2015 23:09:19.445506")
tf = dateutil.parser.parse("24-FEB-2015 01:09:22.404973")
n = 10**4
series = pandas.period_range(start=t0, end=tf, periods=n)
This example fails. Maybe pandas isn't intended to give date ranges with frequencies shorter than a day?
I could manually estimate a frequency, i.e. (tf-t0)/n, but I'm concerned that naively adding this timedelta to the start epoch over and over will accumulate significant rounding errors as I approach the end epoch.
I could resort to working exclusively with floats instead of datetime objects. (For example, subtract the start epoch from the end epoch, divide the timedelta by some unit such as a second, then simply apply numpy's linspace.) But casting everything to floats (and converting back to dates only when needed) sacrifices the advantages of special data types, such as simpler debugging. Is this the best solution?
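On the rounding concern: one way to sidestep accumulation without leaving datetime objects is to scale the whole span by i/(n-1) for each point, instead of adding a rounded step repeatedly. Below is a minimal sketch using only the standard library. The helper name linspace_datetimes is made up for illustration, and it assumes both endpoints should be included in the n points.

```python
from datetime import datetime, timedelta


def linspace_datetimes(start, stop, num):
    """Return `num` evenly spaced datetimes from start to stop, inclusive.

    Hypothetical helper, not part of any library.
    """
    if num < 2:
        raise ValueError("num must be at least 2")
    span = stop - start
    # timedelta * int is exact; the single division rounds to the nearest
    # microsecond, so each point is rounded once and errors do not build up.
    return [start + (span * i) / (num - 1) for i in range(num)]


t0 = datetime(2015, 2, 23, 23, 9, 19, 445506)
tf = datetime(2015, 2, 24, 1, 9, 22, 404973)
series = linspace_datetimes(t0, tf, 10**4)
```

Each element is computed independently from the start epoch, so the last element equals tf exactly. The price is microsecond resolution, which is all datetime.timedelta offers.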
A workaround* is to use numpy's linspace:
In [11]: np.linspace(pd.Timestamp("23-FEB-2015 23:09:19.445506").value, pd.Timestamp("24-FEB-2015 01:09:22.404973").value, 50, dtype=np.int64)
Out[11]:
array([1424732959445506048, 1424733106444678912, 1424733253443851520,
1424733400443024384, 1424733547442197248, 1424733694441370112,
1424733841440542720, 1424733988439715584, 1424734135438888448,
1424734282438061312, 1424734429437233920, 1424734576436406784,
...
1424739133410763520, 1424739280409936384, 1424739427409108992,
1424739574408281856, 1424739721407454720, 1424739868406627584,
1424740015405800192, 1424740162404973056])
In [12]: pd.DatetimeIndex(np.linspace(pd.Timestamp("23-FEB-2015 23:09:19.445506").value, pd.Timestamp("24-FEB-2015 01:09:22.404973").value, 50, dtype=np.int64))
Out[12]:
DatetimeIndex(['2015-02-23 23:09:19.445506048',
'2015-02-23 23:11:46.444678912',
'2015-02-23 23:14:13.443851520',
'2015-02-23 23:16:40.443024384',
...
'2015-02-24 01:04:28.406627584',
'2015-02-24 01:06:55.405800192',
'2015-02-24 01:09:22.404973056'],
dtype='datetime64[ns]', freq=None)
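Note that the endpoints in the output above come back as ...445506048 and ...404973056 rather than ...445506000 and ...404973000. linspace computes in float64, which cannot represent every integer near 1.4e18 exactly. A possible refinement, sketched below, is to run linspace over the nanosecond offsets from the start (small enough to be exact in float64 for a span of a couple of hours) and add the start back afterwards. The helper name linspace_timestamps is hypothetical.

```python
import numpy as np
import pandas as pd


def linspace_timestamps(start, stop, num):
    """Evenly spaced DatetimeIndex from start to stop, inclusive.

    Hypothetical helper: a variation on the workaround above.
    """
    start, stop = pd.Timestamp(start), pd.Timestamp(stop)
    # Span in integer nanoseconds, in the same spirit as Timestamp.value above.
    span_ns = (stop - start).value
    # Offsets from zero stay far below 2**53, so float64 holds them exactly.
    offsets = np.linspace(0, span_ns, num).round().astype(np.int64)
    return start + pd.to_timedelta(offsets, unit="ns")


idx = linspace_timestamps("23-FEB-2015 23:09:19.445506",
                          "24-FEB-2015 01:09:22.404973", 50)
```

The result is still a DatetimeIndex, so nothing downstream needs to change.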
*From using date_range directly:
In [21]: pd.date_range("23-FEB-2015 23:09:19.445506", "24-FEB-2015 01:09:22.404973", periods=10**4)
...
ValueError: Must specify two of start, end, or periods
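The ValueError above reflects the pandas version in use at the time. As far as I know, pandas 0.23 and later accept start, end and periods together (with freq left unset) and return linearly spaced timestamps, so the workaround may no longer be needed. A short sketch, assuming a recent enough pandas:

```python
import pandas as pd

start = "23-FEB-2015 23:09:19.445506"
end = "24-FEB-2015 01:09:22.404973"

# With start, end and periods given and no freq, newer pandas spaces the
# points linearly between the two endpoints (both included).
idx = pd.date_range(start=start, end=end, periods=10**4)
```

If this still raises on your installation, the linspace approach should work regardless of version.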