I'm trying to transform a (well, many) column of return data into a column of closing prices. In Clojure I'd use reductions, which is like reduce but returns a sequence of all the intermediate values.
e.g.
$ c
0.12
-0.13
0.23
0.17
0.29
-0.11
# something like this
$ c.reductions(lambda accumulator, ret: accumulator * (1 + ret), init=1)
1.12
0.97
1.20
1.40
1.81
1.61
NB: The actual closing price doesn't matter, hence using 1 as the initial value. I just need a "mock" closing price.
My data's actual structure is a DataFrame of named columns of TimeSeries. I guess I'm looking for a function similar to applymap, but I'd rather not do something hacky with that function and reference the DF from within it (which I suppose is one solution to this problem?).
Additionally, what would I do if I wanted to keep the returns data but have the closing "price" alongside it? Should I return a tuple instead, and have the TimeSeries be of type (returns, closing_price)?
It doesn't look like it's a well-publicized feature yet, but you can use expanding_apply to achieve the returns calculation:
In [1]: s
Out[1]:
0 0.12
1 -0.13
2 0.23
3 0.17
4 0.29
5 -0.11
In [2]: pd.expanding_apply(s, lambda w: reduce(lambda x, y: x * (1 + y), w, 1))
Out[2]:
0 1.120000
1 0.974400
2 1.198512
3 1.402259
4 1.808914
5 1.609934
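As an aside, newer pandas versions replaced pd.expanding_apply with the .expanding().apply() method, and on Python 3 reduce lives in functools. A runnable sketch of the same idea on a modern pandas (an adaptation, not the answer's original code):

```python
from functools import reduce

import pandas as pd

s = pd.Series([0.12, -0.13, 0.23, 0.17, 0.29, -0.11])

# .expanding().apply() passes each growing window (first index through
# the current one) to the function, mirroring pd.expanding_apply
result = s.expanding().apply(lambda w: reduce(lambda acc, r: acc * (1 + r), w, 1))
```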
I'm not 100% certain, but I believe expanding_apply works on the applied series starting from the first index through the current index. I use the built-in reduce function, which works exactly like your Clojure function.
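For completeness: Python's closest analogue to Clojure's reductions (i.e. reduce that yields every intermediate value) is itertools.accumulate. A minimal sketch on plain lists (the initial keyword needs Python 3.8+):

```python
from itertools import accumulate

returns = [0.12, -0.13, 0.23, 0.17, 0.29, -0.11]

# Like Clojure's reductions: yields each intermediate accumulator value
prices = list(accumulate(returns, lambda acc, r: acc * (1 + r), initial=1))

# prices[0] is the seed value 1; prices[1:] are the mock closing prices
```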
Docstring for expanding_apply:
Generic expanding function application
Parameters
----------
arg : Series, DataFrame
func : function
Must produce a single value from an ndarray input
min_periods : int
Minimum number of observations in window required to have a value
freq : None or string alias / date offset object, default=None
Frequency to conform to before computing statistic
center : boolean, default False
Whether the label should correspond with center of window
Returns
-------
y : type of input argument
It's worth noting that it's often faster (as well as easier to understand) to write things more verbosely in pandas rather than as a reduce. In your specific example I would just use add and then cumprod:
In [2]: c.add(1).cumprod()
Out[2]:
0 1.120000
1 0.974400
2 1.198512
3 1.402259
4 1.808914
5 1.609934
or perhaps init * c.add(1).cumprod().
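As for keeping the returns alongside the derived price (the second part of the question): rather than packing tuples into one TimeSeries, the idiomatic approach is two columns in one DataFrame. A minimal sketch (the column names are my own choice):

```python
import pandas as pd

c = pd.Series([0.12, -0.13, 0.23, 0.17, 0.29, -0.11])

# Keep the raw returns and the derived mock closing price side by side
df = pd.DataFrame({"returns": c, "price": c.add(1).cumprod()})
```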
Note: in some cases, however (for example where memory is an issue), you may have to rewrite these in a more low-level/clever way, but it's usually worth trying the simplest method first and testing against it (e.g. using %timeit or memory profiling).