Pandas OHLC aggregation on OHLC data

Tags: python, pandas, dataframe, python-2.7, resampling

I understand that OHLC re-sampling of time series data in Pandas, using one column of data, will work perfectly, for example on the following dataframe:

    >>df
    ctime       openbid
    1443654000  1.11700
    1443654060  1.11700
    ...

    df['ctime']  = pd.to_datetime(df['ctime'], unit='s')
    df           = df.set_index('ctime')
    df.resample('1H',  how='ohlc', axis=0, fill_method='bfill')

    >>>
                         open     high     low       close
    ctime
    2015-09-30 23:00:00  1.11700  1.11700  1.11687   1.11697
    2015-09-30 24:00:00  1.11700  1.11712  1.11697   1.11697
    ...
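For readers on current pandas: the `how=` and `fill_method=` arguments to `resample` were deprecated in pandas 0.18.0 and have since been removed, so the call above no longer runs as written. A minimal sketch of the single-column case in the newer method-chaining style is shown here. The three prices come from the question's tables, and the `60min` frequency string is chosen only because it is accepted across pandas versions.

```python
import pandas as pd

# One price column indexed by epoch seconds, as in the question.
df = pd.DataFrame({
    "ctime": [1443654000, 1443654060, 1443654120],
    "openbid": [1.11700, 1.11700, 1.11701],
})
df["ctime"] = pd.to_datetime(df["ctime"], unit="s")
df = df.set_index("ctime")

# Select the single column, then ask the resampler for OHLC bars.
bars = df["openbid"].resample("60min").ohlc()
print(bars)
```

Selecting one column before calling `.ohlc()` gives a flat frame with `open`, `high`, `low` and `close` columns.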

But what do I do if the data is already in an OHLC format? From what I can gather the OHLC method of the API calculates an OHLC slice for every column, hence if my data is in the format:

                 ctime  openbid  highbid   lowbid  closebid
    0       1443654000  1.11700  1.11700  1.11687   1.11697
    1       1443654060  1.11700  1.11712  1.11697   1.11697
    2       1443654120  1.11701  1.11708  1.11699   1.11708

When I try to re-sample I get an OHLC for each of the columns, like so:

                         openbid                             highbid           \
                            open     high      low    close     open     high
    ctime
    2015-09-30 23:00:00  1.11700  1.11700  1.11700  1.11700  1.11700  1.11712
    2015-09-30 23:01:00  1.11701  1.11701  1.11701  1.11701  1.11708  1.11708
    ...

                                            lowbid                             \
                             low    close     open     high      low    close
    ctime
    2015-09-30 23:00:00  1.11700  1.11712  1.11687  1.11697  1.11687  1.11697
    2015-09-30 23:01:00  1.11708  1.11708  1.11699  1.11699  1.11699  1.11699
    ...

                        closebid
                            open     high      low    close
    ctime
    2015-09-30 23:00:00  1.11697  1.11697  1.11697  1.11697
    2015-09-30 23:01:00  1.11708  1.11708  1.11708  1.11708

Is there a quick(ish) workaround for this that someone is willing to share please, without me having to get knee-deep in pandas manual?

Thanks.

ps, there is this answer - Converting OHLC stock data into a different timeframe with python and pandas - but it was 4 years ago, so I am hoping there has been some progress.


asked Mar 25 '16 15:03

user3439187

2 Answers

This is similar to the answer you linked, but it is a little cleaner and faster, because it uses the optimized built-in aggregations rather than lambdas.

Note that the resample(...).agg(...) syntax requires pandas version 0.18.0 or later.

    In [101]: df.resample('1H').agg({'openbid': 'first',
                                     'highbid': 'max',
                                     'lowbid': 'min',
                                     'closebid': 'last'})
    Out[101]:
                          lowbid  highbid  closebid  openbid
    ctime
    2015-09-30 23:00:00  1.11687  1.11712   1.11708    1.117
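As a self-contained illustration of the same idea, here is a minimal sketch that rebuilds the three sample rows from the question and applies one aggregation per column. It assumes a reasonably recent pandas and uses `60min` instead of `1H` only because that frequency string is accepted across versions.

```python
import pandas as pd

# The three one-minute bid bars from the question.
df = pd.DataFrame({
    "ctime": [1443654000, 1443654060, 1443654120],
    "openbid": [1.11700, 1.11700, 1.11701],
    "highbid": [1.11700, 1.11712, 1.11708],
    "lowbid": [1.11687, 1.11697, 1.11699],
    "closebid": [1.11697, 1.11697, 1.11708],
})
df["ctime"] = pd.to_datetime(df["ctime"], unit="s")
df = df.set_index("ctime")

# Each OHLC column needs its own rule: the first open, the highest high,
# the lowest low and the last close within each hourly bucket.
hourly = df.resample("60min").agg({
    "openbid": "first",
    "highbid": "max",
    "lowbid": "min",
    "closebid": "last",
})
print(hourly)
```

All three rows fall into the 23:00 bucket, so the result is a single hourly bar rather than a separate OHLC block per input column.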

answered Sep 22 '22 04:09

chrisb

You need to use an OrderedDict to keep the column order in newer versions of pandas, like so:

    import pandas as pd
    from collections import OrderedDict

    df['ctime'] = pd.to_datetime(df['ctime'], unit='s')
    df = df.set_index('ctime')
    df = df.resample('5Min').agg(
        OrderedDict([
            ('open', 'first'),
            ('high', 'max'),
            ('low', 'min'),
            ('close', 'last'),
            ('volume', 'sum'),
        ])
    )
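If the only goal is a fixed column order, one option that does not depend on dict ordering at all is to select the columns explicitly after aggregating. The sketch below assumes columns named open, high, low, close and volume, as in the snippet above. The ten one-minute bars are invented purely for illustration, and `ORDER` is a hypothetical helper name.

```python
import pandas as pd

# Ten hypothetical one-minute bars (made-up values, not real market data).
index = pd.date_range("2015-09-30 23:00", periods=10, freq="min")
df = pd.DataFrame({
    "open": [100.0 + i for i in range(10)],
    "high": [102.0 + i for i in range(10)],
    "low": [99.0 + i for i in range(10)],
    "close": [101.0 + i for i in range(10)],
    "volume": [100] * 10,
}, index=index)

# Desired output column order (hypothetical name).
ORDER = ["open", "high", "low", "close", "volume"]

rules = {"open": "first", "high": "max", "low": "min",
         "close": "last", "volume": "sum"}

# Aggregate into five-minute bars, then pin the column order explicitly.
bars_5min = df.resample("5min").agg(rules)[ORDER]
print(bars_5min)
```

On Python 3.7 and later a plain dict already preserves insertion order, so the explicit selection mainly serves as a safeguard and as documentation of the intended layout.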

answered Sep 22 '22 04:09

Benjamin Crouzier
