
How to invoke pandas.rolling.apply with parameters from multiple columns?

Tags:

python

pandas

I've got a dataset:

          Open     High      Low    Close
    0  132.960  133.340  132.940  133.105
    1  133.110  133.255  132.710  132.755
    2  132.755  132.985  132.640  132.735
    3  132.730  132.790  132.575  132.685
    4  132.685  132.785  132.625  132.755

I am trying to use the rolling.apply function across all rows, like this:

    df['new_col'] = df[['Open']].rolling(2).apply(AccumulativeSwingIndex(df['High'], df['Low'], df['Close']))

This raises an error. Alternatively:

    df['new_col'] = df[['Open', 'High', 'Low', 'Close']].rolling(2).apply(AccumulativeSwingIndex)

This passes only the values from the 'Open' column to the function.

Can anybody help me?

quarkpol asked Aug 10 '16 16:08



2 Answers

Define your own roll

We can create a function that takes a window size argument w and any other keyword arguments. It builds a new DataFrame that stacks every window, then calls groupby on it, passing the keyword arguments through via kwargs.

Note: I didn't have to use stride_tricks.as_strided, but it is succinct and, in my opinion, appropriate.

    from numpy.lib.stride_tricks import as_strided as stride
    import pandas as pd

    def roll(df, w, **kwargs):
        v = df.values
        d0, d1 = v.shape
        s0, s1 = v.strides

        # Overlapping (n_windows, w, n_cols) view of the values, built without copying.
        a = stride(v, (d0 - (w - 1), w, d1), (s0, s0, s1))

        # Stack the windows into one frame, keyed by the first row label of each window.
        rolled_df = pd.concat({
            row: pd.DataFrame(values, columns=df.columns)
            for row, values in zip(df.index, a)
        })

        # Each group is one window.
        return rolled_df.groupby(level=0, **kwargs)

    roll(df, 2).mean()

           Open      High       Low    Close
    0  133.0350  133.2975  132.8250  132.930
    1  132.9325  133.1200  132.6750  132.745
    2  132.7425  132.8875  132.6075  132.710
    3  132.7075  132.7875  132.6000  132.720
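
As a minimal sketch of how this roll could be used with a function that reads several columns at once: window_range below is a hypothetical stand-in for AccumulativeSwingIndex (which the question does not define), and the data is the sample from the question. The helper is repeated only so the snippet runs on its own.

```python
from numpy.lib.stride_tricks import as_strided as stride
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Open":  [132.960, 133.110, 132.755, 132.730, 132.685],
    "High":  [133.340, 133.255, 132.985, 132.790, 132.785],
    "Low":   [132.940, 132.710, 132.640, 132.575, 132.625],
    "Close": [133.105, 132.755, 132.735, 132.685, 132.755],
})

def roll(df, w, **kwargs):
    # Same helper as in the answer, repeated so this snippet is self-contained.
    v = df.values
    d0, d1 = v.shape
    s0, s1 = v.strides
    a = stride(v, (d0 - (w - 1), w, d1), (s0, s0, s1))
    rolled_df = pd.concat({
        row: pd.DataFrame(values, columns=df.columns)
        for row, values in zip(df.index, a)
    })
    return rolled_df.groupby(level=0, **kwargs)

def window_range(window):
    # Hypothetical example function (an assumption, not from the question):
    # it receives the whole window as a DataFrame, so it can mix columns.
    return window["High"].max() - window["Low"].min()

# One value per window, labelled by the first row of that window.
result = roll(df, 2).apply(window_range)
print(result)
```

Because each group is a full DataFrame slice, the function sees every column of the window, which is what rolling(...).apply does not offer directly.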

We can also use the pandas.DataFrame.pipe method to the same effect:

    df.pipe(roll, w=2).mean()
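
The note above says as_strided is optional. A minimal sketch of a variant without it, assuming plain iloc slicing is fast enough for your data (roll_simple is a hypothetical name, not part of the answer or of pandas), which can be piped the same way:

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Open":  [132.960, 133.110, 132.755, 132.730, 132.685],
    "High":  [133.340, 133.255, 132.985, 132.790, 132.785],
    "Low":   [132.940, 132.710, 132.640, 132.575, 132.625],
    "Close": [133.105, 132.755, 132.735, 132.685, 132.755],
})

def roll_simple(df, w, **kwargs):
    # Hypothetical variant of roll: build each window with iloc slicing
    # instead of a strided view. Windows are keyed by their first row label.
    windows = {
        df.index[i]: df.iloc[i:i + w].reset_index(drop=True)
        for i in range(len(df) - w + 1)
    }
    # Concatenating a dict puts the keys in the outer index level,
    # so grouping on level 0 yields one group per window.
    return pd.concat(windows).groupby(level=0, **kwargs)

means = df.pipe(roll_simple, w=2).mean()
print(means)
```

This copies each window, so it trades some memory and speed for not having to reason about strides.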


OLD ANSWER

Panel has been deprecated (and was removed in pandas 0.25), so this version only runs on older pandas. See above for the updated answer.

See https://stackoverflow.com/a/37491779/2336654

Define our own roll:

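    import numpy as np
    import pandas as pd

    def roll(df, w, **kwargs):
        # Stack each window of w consecutive rows along a third axis,
        # then transpose to (windows, columns, rows-in-window).
        roll_array = np.dstack([df.values[i:i+w, :] for i in range(len(df.index) - w + 1)]).T
        # Items are labelled by the last row of each window.
        panel = pd.Panel(roll_array,
                         items=df.index[w-1:],
                         major_axis=df.columns,
                         minor_axis=pd.Index(range(w), name='roll'))
        return panel.to_frame().unstack().T.groupby(level=0, **kwargs)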

You should then be able to call:

    roll(df, 2).apply(your_function)

Using mean:

    roll(df, 2).mean()

    major      Open      High       Low    Close
    1      133.0350  133.2975  132.8250  132.930
    2      132.9325  133.1200  132.6750  132.745
    3      132.7425  132.8875  132.6075  132.710
    4      132.7075  132.7875  132.6000  132.720

Using a custom function:

    f = lambda df: df.sum(1)

    roll(df, 2, group_keys=False).apply(f)

       roll
    1  0       532.345
       1       531.830
    2  0       531.830
       1       531.115
    3  0       531.115
       1       530.780
    4  0       530.780
       1       530.850
    dtype: float64
piRSquared answered Oct 06 '22 15:10


Since your rolling window is not too large, you can also place the shifted rows side by side in the same DataFrame and then use apply to reduce each row.

For example, with the following dataset df:

                  Open      High     Low   Close
    Date
    2017-11-07  258.97  259.3500  258.09  258.67
    2017-11-08  258.47  259.2200  258.15  259.11
    2017-11-09  257.73  258.3900  256.36  258.17
    2017-11-10  257.73  258.2926  257.37  258.09
    2017-11-13  257.31  258.5900  257.27  258.33

You can just add the rolling data to this DataFrame with:

    window = 2
    df1 = pd.DataFrame(index=df.index)
    for i in range(window):
        df_shifted = df.shift(i).copy()
        df_shifted.columns = ["{}-{}".format(s, i) for s in df.columns]
        df1 = df1.join(df_shifted)
    df1

                Open-0    High-0   Low-0  Close-0  Open-1    High-1   Low-1  Close-1
    Date
    2017-11-07  258.97  259.3500  258.09   258.67     NaN       NaN     NaN      NaN
    2017-11-08  258.47  259.2200  258.15   259.11  258.97  259.3500  258.09   258.67
    2017-11-09  257.73  258.3900  256.36   258.17  258.47  259.2200  258.15   259.11
    2017-11-10  257.73  258.2926  257.37   258.09  257.73  258.3900  256.36   258.17
    2017-11-13  257.31  258.5900  257.27   258.33  257.73  258.2926  257.37   258.09

Then you can easily apply your function to each row, which now holds all the rolling data you need:

    df1.apply(AccumulativeSwingIndex, axis=1)
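
A minimal end-to-end sketch of this approach, using the dataset above. Since AccumulativeSwingIndex is not defined in the question, true_range below is a hypothetical stand-in that simply shows how a row function can read the current and the shifted columns by name; the explicit NaN check for the first row is also an assumption about the desired behaviour.

```python
import numpy as np
import pandas as pd

# Dataset from this answer.
df = pd.DataFrame(
    {
        "Open":  [258.97, 258.47, 257.73, 257.73, 257.31],
        "High":  [259.35, 259.22, 258.39, 258.2926, 258.59],
        "Low":   [258.09, 258.15, 256.36, 257.37, 257.27],
        "Close": [258.67, 259.11, 258.17, 258.09, 258.33],
    },
    index=pd.to_datetime(
        ["2017-11-07", "2017-11-08", "2017-11-09", "2017-11-10", "2017-11-13"]
    ).rename("Date"),
)

# Join the current row (suffix -0) with the previous row (suffix -1).
window = 2
df1 = pd.DataFrame(index=df.index)
for i in range(window):
    df_shifted = df.shift(i)
    df_shifted.columns = ["{}-{}".format(s, i) for s in df.columns]
    df1 = df1.join(df_shifted)

def true_range(row):
    # Hypothetical stand-in for AccumulativeSwingIndex (an assumption):
    # today's high/low range, extended to include yesterday's close.
    if row.isna().any():
        return np.nan  # the first row has no previous data
    high = max(row["High-0"], row["Close-1"])
    low = min(row["Low-0"], row["Close-1"])
    return high - low

df["new_col"] = df1.apply(true_range, axis=1)
print(df["new_col"])
```

The row function receives a Series whose labels are the suffixed column names, so any combination of current and lagged values is available in a single call.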
aliciawyy answered Oct 06 '22 15:10