I've got a dataset:
Open High Low Close
0 132.960 133.340 132.940 133.105
1 133.110 133.255 132.710 132.755
2 132.755 132.985 132.640 132.735
3 132.730 132.790 132.575 132.685
4 132.685 132.785 132.625 132.755
I am trying to use rolling.apply with a function that needs several columns from each window, like this:
df['new_col']= df[['Open']].rolling(2).apply(AccumulativeSwingIndex(df['High'],df['Low'],df['Close']))
or
df['new_col']= df[['Open', 'High', 'Low', 'Close']].rolling(2).apply(AccumulativeSwingIndex)
Can anybody help me?
roll
We can create a function that takes a window size argument w and any other keyword arguments. We use it to build a new DataFrame on which we call groupby, passing the keyword arguments along via kwargs.
Note: this relies on stride_tricks.as_strided, which NumPy advises using with care, but it is succinct and in my opinion appropriate.
from numpy.lib.stride_tricks import as_strided as stride
import pandas as pd
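def roll(df, w, **kwargs):
    v = df.values
    d0, d1 = v.shape
    s0, s1 = v.strides
    # 3-D view of shape (number of windows, w, number of columns); no data is copied
    a = stride(v, (d0 - (w - 1), w, d1), (s0, s0, s1))
    # one small DataFrame per window, keyed by the label of the window's first row
    rolled_df = pd.concat({
        row: pd.DataFrame(values, columns=df.columns)
        for row, values in zip(df.index, a)
    })
    # group by the window key so each group is one window
    return rolled_df.groupby(level=0, **kwargs)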
roll(df, 2).mean()
Open High Low Close
0 133.0350 133.2975 132.8250 132.930
1 132.9325 133.1200 132.6750 132.745
2 132.7425 132.8875 132.6075 132.710
3 132.7075 132.7875 132.6000 132.720
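If you would rather not call as_strided directly, NumPy 1.20 and later ship sliding_window_view, which builds the same kind of read-only windowed view. This is a swapped-in alternative, not the method used in roll above; the variable names windows and window_means are only for illustration, and the data is the frame from the question.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

df = pd.DataFrame({
    "Open":  [132.960, 133.110, 132.755, 132.730, 132.685],
    "High":  [133.340, 133.255, 132.985, 132.790, 132.785],
    "Low":   [132.940, 132.710, 132.640, 132.575, 132.625],
    "Close": [133.105, 132.755, 132.735, 132.685, 132.755],
})

w = 2
# Windowing along axis 0 gives shape (n_windows, n_columns, w):
# the window axis is appended last.
windows = sliding_window_view(df.to_numpy(), window_shape=w, axis=0)
# Reorder to (n_windows, w, n_columns) so each entry is laid out like a small frame.
windows = windows.transpose(0, 2, 1)

# Reduce each window over its rows; label each result by the window's first row,
# the same labelling that roll uses.
window_means = pd.DataFrame(
    windows.mean(axis=1),
    columns=df.columns,
    index=df.index[: len(windows)],
)
print(window_means)
```

The Open column of window_means should match the roll(df, 2).mean() output shown above.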
We can also use the pandas.DataFrame.pipe method to the same effect:
df.pipe(roll, w=2).mean()
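Since roll returns a groupby object, each window reaches your function as a small DataFrame with every column available. Here is a minimal sketch; window_range is a hypothetical stand-in for AccumulativeSwingIndex (whose definition isn't shown in the question), and roll is repeated only so the snippet runs on its own.

```python
import pandas as pd
from numpy.lib.stride_tricks import as_strided as stride

df = pd.DataFrame({
    "Open":  [132.960, 133.110, 132.755, 132.730, 132.685],
    "High":  [133.340, 133.255, 132.985, 132.790, 132.785],
    "Low":   [132.940, 132.710, 132.640, 132.575, 132.625],
    "Close": [133.105, 132.755, 132.735, 132.685, 132.755],
})

def roll(df, w, **kwargs):
    v = df.values
    d0, d1 = v.shape
    s0, s1 = v.strides
    a = stride(v, (d0 - (w - 1), w, d1), (s0, s0, s1))
    rolled_df = pd.concat({
        row: pd.DataFrame(values, columns=df.columns)
        for row, values in zip(df.index, a)
    })
    return rolled_df.groupby(level=0, **kwargs)

def window_range(window):
    # Hypothetical example function: highest High minus lowest Low in the window.
    # It receives a DataFrame, so it can read any combination of columns.
    return window["High"].max() - window["Low"].min()

# One value per window, indexed by the label of the window's first row.
result = roll(df, 2).apply(window_range)
print(result)
```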
Panel has been deprecated (and was removed in pandas 0.25). See above for the updated answer.
See https://stackoverflow.com/a/37491779/2336654
Define our own roll:
import numpy as np
import pandas as pd

# Only runs on an old pandas release that still provides pd.Panel.
def roll(df, w, **kwargs):
    roll_array = np.dstack([df.values[i:i+w, :] for i in range(len(df.index) - w + 1)]).T
    panel = pd.Panel(roll_array,
                     items=df.index[w-1:],
                     major_axis=df.columns,
                     minor_axis=pd.Index(range(w), name='roll'))
    return panel.to_frame().unstack().T.groupby(level=0, **kwargs)
You should then be able to:
roll(df, 2).apply(your_function)
Using mean
roll(df, 2).mean()
major Open High Low Close
1 133.0350 133.2975 132.8250 132.930
2 132.9325 133.1200 132.6750 132.745
3 132.7425 132.8875 132.6075 132.710
4 132.7075 132.7875 132.6000 132.720
f = lambda df: df.sum(1)
roll(df, 2, group_keys=False).apply(f)
roll
1 0 532.345
1 531.830
2 0 531.830
1 531.115
3 0 531.115
1 530.780
4 0 530.780
1 530.850
dtype: float64
As your rolling window is not too large, I think you can also put the shifted rows in the same DataFrame and then use the apply function to reduce each row.
For example, with the dataset df as follows:
Open High Low Close
Date
2017-11-07 258.97 259.3500 258.09 258.67
2017-11-08 258.47 259.2200 258.15 259.11
2017-11-09 257.73 258.3900 256.36 258.17
2017-11-10 257.73 258.2926 257.37 258.09
2017-11-13 257.31 258.5900 257.27 258.33
You can just add the rolling data to this dataframe with
window = 2
df1 = pd.DataFrame(index=df.index)
for i in range(window):
    df_shifted = df.shift(i).copy()
    df_shifted.columns = ["{}-{}".format(s, i) for s in df.columns]
    df1 = df1.join(df_shifted)
df1
Open-0 High-0 Low-0 Close-0 Open-1 High-1 Low-1 Close-1
Date
2017-11-07 258.97 259.3500 258.09 258.67 NaN NaN NaN NaN
2017-11-08 258.47 259.2200 258.15 259.11 258.97 259.3500 258.09 258.67
2017-11-09 257.73 258.3900 256.36 258.17 258.47 259.2200 258.15 259.11
2017-11-10 257.73 258.2926 257.37 258.09 257.73 258.3900 256.36 258.17
2017-11-13 257.31 258.5900 257.27 258.33 257.73 258.2926 257.37 258.09
Then you can easily apply a function to each row, which now holds all the rolling data you need:
df1.apply(AccumulativeSwingIndex, axis=1)
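To make the row-wise step concrete, here is a small sketch on the sample data above. The function true_range is a hypothetical stand-in for AccumulativeSwingIndex (the real function isn't shown in the question); it only illustrates reading the current bar ("-0" columns) and the previous bar ("-1" columns) from the same row.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Open":  [258.97, 258.47, 257.73, 257.73, 257.31],
        "High":  [259.35, 259.22, 258.39, 258.2926, 258.59],
        "Low":   [258.09, 258.15, 256.36, 257.37, 257.27],
        "Close": [258.67, 259.11, 258.17, 258.09, 258.33],
    },
    index=pd.to_datetime(
        ["2017-11-07", "2017-11-08", "2017-11-09", "2017-11-10", "2017-11-13"]
    ),
)
df.index.name = "Date"

# Build the wide frame: suffix -0 is the current row, -1 the previous row.
window = 2
df1 = pd.DataFrame(index=df.index)
for i in range(window):
    shifted = df.shift(i)
    shifted.columns = ["{}-{}".format(c, i) for c in df.columns]
    df1 = df1.join(shifted)

def true_range(row):
    # Hypothetical example: range of the current bar extended to the previous close.
    high = max(row["High-0"], row["Close-1"])
    low = min(row["Low-0"], row["Close-1"])
    return high - low

# Drop the first row, which has no previous bar (its -1 columns are NaN).
result = df1.dropna().apply(true_range, axis=1)
print(result)
```

Dropping the incomplete rows first keeps the function free of NaN handling; whether that suits your case depends on how you want the first window - 1 rows treated.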
Here's a workaround I came up with:
df['new_col'] = list(map(fn, df.rolling(2)))
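A slightly fuller sketch of this workaround follows. It assumes pandas 1.1 or later, where a Rolling object can be iterated, and fn here is a hypothetical example function. Note that iteration yields one window per row, so the first window holds only a single row and fn has to cope with that.

```python
import math
import pandas as pd

df = pd.DataFrame({
    "Open":  [132.960, 133.110, 132.755, 132.730, 132.685],
    "High":  [133.340, 133.255, 132.985, 132.790, 132.785],
    "Low":   [132.940, 132.710, 132.640, 132.575, 132.625],
    "Close": [133.105, 132.755, 132.735, 132.685, 132.755],
})

def fn(window):
    # Each window is a DataFrame with all columns.
    # The first window is shorter than 2 rows, so return NaN for it.
    if len(window) < 2:
        return float("nan")
    prev, curr = window.iloc[0], window.iloc[1]
    # Hypothetical example: close-to-close change between the two rows.
    return curr["Close"] - prev["Close"]

# The list is fully built before the new column is assigned.
df["new_col"] = list(map(fn, df.rolling(2)))
print(df)
```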
I ran into a similar problem, and the following lines may help. This might be the simplest way to retrieve the data (as matrices) from DataFrame.rolling(), after which we can do almost anything with it. By comparison, d.rolling().apply() passes one column at a time and only accepts functions that reduce each window to a single value.
size = 20
matrices = [x.values for x in d.rolling(size)][size-1:]
len(matrices)
[do_anything(i) for i in matrices]
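The difference can be checked directly. The sketch below (pandas 1.1 or later for the iteration; record_shape is a hypothetical helper that exists only to log what it receives) shows that rolling().apply() hands the function 1-D slices of a single column, while the list comprehension yields full 2-D windows.

```python
import pandas as pd

df = pd.DataFrame({
    "Open":  [132.960, 133.110, 132.755, 132.730, 132.685],
    "High":  [133.340, 133.255, 132.985, 132.790, 132.785],
    "Low":   [132.940, 132.710, 132.640, 132.575, 132.625],
    "Close": [133.105, 132.755, 132.735, 132.685, 132.755],
})

size = 2
seen_shapes = []

def record_shape(values):
    # With raw=True the function receives a NumPy array for one column only.
    seen_shapes.append(values.shape)
    return 0.0  # apply requires a single value back

df.rolling(size).apply(record_shape, raw=True)
print(set(seen_shapes))  # only 1-D shapes appear

# Iterating the Rolling object yields DataFrames; skip the incomplete leading windows.
matrices = [x.values for x in df.rolling(size)][size - 1:]
print(len(matrices), matrices[0].shape)
```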