
How to resample pandas df tick data to 5 min OHLC data

I have a pandas df 'instr_bar' with tick data as follows:

time
2016-07-29 16:07:24     5.72
2016-07-29 16:07:24     5.72
2016-07-29 16:07:24     5.72
2016-07-29 16:07:58     5.72
2016-07-29 16:07:58     5.72
2016-07-29 16:09:49     5.70
2016-07-29 16:09:50     5.73
2016-07-29 16:11:14     5.73
2016-07-29 16:11:14     5.73
2016-07-29 16:14:53     5.77
2016-07-29 16:14:53     5.77
2016-07-29 16:17:27     5.75
2016-07-29 16:17:43     5.76
2016-07-29 16:17:43     5.76

I want to turn this into 5 minute OHLC. The index is not unique in many instances.

I then use the following code: instr_bar = instr_bar.resample('5Min').ohlc()

I then get the following df:

                     open   high    low  close
time                                           
2016-07-29 15:40:00   5.74   5.74   5.74   5.74
2016-07-29 15:45:00    NaN    NaN    NaN    NaN
2016-07-29 15:50:00   5.75   5.75   5.75   5.75
2016-07-29 15:55:00   5.75   5.75   5.72   5.72
2016-07-29 16:00:00   5.72   5.72   5.72   5.72
2016-07-29 16:05:00   5.72   5.73   5.70   5.73
2016-07-29 16:10:00   5.73   5.77   5.73   5.77
2016-07-29 16:15:00   5.75   5.76   5.72   5.72
2016-07-29 16:20:00    NaN    NaN    NaN    NaN
2016-07-29 16:25:00   5.72   5.72   5.72   5.72

Q1: How do I backfill the NaNs with last observed values?

Q2: I now also get NaNs outside the trading/opening hours (09:00 - 16:30). How do I get rid of them?

cJc asked Jul 30 '16 10:07




1 Answer

Try bfill(), which fills each empty bar from the next bar that has data:

instr_bar = instr_bar.resample('5T').ohlc().bfill()

or ffill(), which carries the last observed values forward:

instr_bar = instr_bar.resample('5T').ohlc().ffill()

Which one to use depends on what you want to achieve.
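To make the difference concrete, here is a small sketch on made-up ticks (the prices and timestamps are hypothetical, not the asker's data) with one empty 5-minute bucket. It uses the '5min' alias, which is equivalent to '5T'; pandas 2.2 and later emit a deprecation warning for 'T'.

```python
import pandas as pd

# Hypothetical ticks: nothing trades between 16:05 and 16:10.
ticks = pd.Series(
    [5.72, 5.70, 5.73, 5.75],
    index=pd.to_datetime([
        "2016-07-29 16:00:10",
        "2016-07-29 16:03:00",
        "2016-07-29 16:04:59",
        "2016-07-29 16:12:00",
    ]),
)

bars = ticks.resample("5min").ohlc()   # the 16:05 bar is all NaN
filled_forward = bars.ffill()          # 16:05 copies the 16:00 bar
filled_backward = bars.bfill()         # 16:05 copies the 16:10 bar

print(filled_forward)
print(filled_backward)
```

Note that ffill() copies the whole previous row, so the filled bar inherits the previous bar's open, high and low as well as its close. bfill() fills from bars that had not happened yet, which matters if the data feeds a backtest.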

If you want to filter rows by time of day, you can use the between_time() method:

instr_bar.between_time('09:00', '16:30')
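As a quick check of what between_time() keeps, here is a sketch on a hypothetical frame of 5-minute rows running from 08:50 to 16:40 (the 'close' column holds dummy values). By default both endpoints are included.

```python
import pandas as pd

# Hypothetical 5-minute index that starts before and ends after the session.
idx = pd.date_range("2016-07-29 08:50", "2016-07-29 16:40", freq="5min")
bars = pd.DataFrame({"close": range(len(idx))}, index=idx)

# Keeps rows whose time of day lies in [09:00, 16:30], on every date.
session = bars.between_time("09:00", "16:30")

print(session.index[0], session.index[-1], len(session))
```

Since resample() labels each bar by its left edge by default, the row labelled 16:30 covers ticks from 16:30 up to 16:35. Whether to keep that row or stop at '16:25' depends on how the close is defined for your market.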

All together:

instr_bar = instr_bar.resample('5T').ohlc().ffill().between_time('09:00', '16:30')
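A possible variation, not part of the answer above: instead of copying the whole previous bar, an empty bar can be made flat at the previous close. The sketch below assumes the ticks are a Series with a DatetimeIndex (so ohlc() returns plain open/high/low/close columns); the helper name ticks_to_session_bars and the sample ticks are made up for illustration.

```python
import pandas as pd

def ticks_to_session_bars(ticks, freq="5min", start="09:00", end="16:30"):
    """Hypothetical helper: OHLC bars, empty bars flat at the previous close."""
    bars = ticks.resample(freq).ohlc()
    # Carry the last traded price forward into empty buckets.
    prev_close = bars["close"].ffill()
    bars["close"] = prev_close
    # Where open/high/low are still missing, use the carried close.
    for col in ["open", "high", "low"]:
        bars[col] = bars[col].fillna(prev_close)
    # Drop everything outside the trading session.
    return bars.between_time(start, end)

# Hypothetical ticks with an empty 16:05 bucket.
ticks = pd.Series(
    [5.72, 5.70, 5.73, 5.75],
    index=pd.to_datetime([
        "2016-07-29 16:00:10",
        "2016-07-29 16:03:00",
        "2016-07-29 16:04:59",
        "2016-07-29 16:12:00",
    ]),
)
session_bars = ticks_to_session_bars(ticks)
print(session_bars)
```

With data spanning several days, filling before filtering means the first bar of a session can be filled from the previous day's last price, so check that this is what you want.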
MaxU - stop WAR against UA answered Sep 23 '22 15:09