I'd like to calculate the determinant of each 2x2 matrix obtained by rolling a window of size 2 over the rows of an Nx2 matrix. The determinant is just an example function. In general, I'd like to apply a function to each sub-dataframe obtained by windowing a larger dataframe.
For example, this is a single 2x2 matrix, and I calculate its determinant like so:
import pandas as pd
import numpy as np
d = pd.DataFrame({
   "X": [1,2],
   "Y": [3,4]
   })
np.linalg.det(d)
Now, I can form four 2x2 matrices by sliding a window of size 2 along axis=0 of the following dataframe:
df = pd.DataFrame({
    "A": [1,2,3,4,5],
    "B": [6,7,8,9,10],
  })
which looks like:
    A   B
0   1   6
1   2   7
2   3   8
3   4   9
4   5   10
so I would get [-5., -5., -5., -5.]
As far as I can see, pandas.DataFrame.rolling and rolling.apply can only be applied to a 1D vector, not to a dataframe. How would you do this?
The min_periods argument specifies the minimum number of observations in the current window required to generate a rolling value; otherwise, the result is NaN.
Apply a function along an axis of the DataFrame. Objects passed to the function are Series objects whose index is either the DataFrame's index (axis=0) or the DataFrame's columns (axis=1). By default (result_type=None), the final return type is inferred from the return type of the applied function.
Extract a numpy array from your dataframe:
>>> array = df.values
>>> array
array([[ 1,  6],
       [ 2,  7],
       [ 3,  8],
       [ 4,  9],
       [ 5, 10]])
Use numpy's as_strided function to create your sliding window view:
>>> from numpy.lib.stride_tricks import as_strided
>>> rows, cols = array.shape
>>> row_stride, col_stride = array.strides
>>> windowed_array = as_strided(
...     array,
...     shape=(rows - 2 + 1, 2, cols),
...     strides=(row_stride, row_stride, col_stride))
>>> windowed_array
array([[[ 1,  6],
        [ 2,  7]],
       [[ 2,  7],
        [ 3,  8]],
       [[ 3,  8],
        [ 4,  9]],
       [[ 4,  9],
        [ 5, 10]]])
And now apply your function to the resulting array:
>>> np.linalg.det(windowed_array)
array([-5., -5., -5., -5.])
You can replace np.linalg.det with other functions as you like.
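On NumPy 1.20 or later, the same windows can be built with sliding_window_view, which works out the shape and strides itself, so there is less room for an out-of-bounds view than with as_strided. This is a minimal sketch, assuming the same df as in the question. Note that the window axis is appended last, so each window comes out as (cols, window) and is swapped back here:

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [6, 7, 8, 9, 10]})
array = df.to_numpy()

# Shape (rows - 2 + 1, cols, 2): the window axis is appended at the end.
windows = sliding_window_view(array, window_shape=2, axis=0)

# Swap the last two axes so each window is (window_rows, cols),
# i.e. the same layout as array[i:i + 2].
windows = windows.swapaxes(1, 2)

# np.linalg.det is applied over the last two axes, one result per window.
dets = np.linalg.det(windows)
print(dets)
```

For the determinant the swap makes no difference, since a matrix and its transpose have the same determinant, but it matters for functions that depend on row order.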
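# Use apply to take 'A' and 'B' from the current row and the next row and feed them into the function.
# Note: .loc slicing is label-based and inclusive, so this relies on the default 0..N-1 integer index.
df.apply(lambda x: np.linalg.det(df.loc[x.name:x.name + 1, 'A':'B']) if x.name < (len(df) - 1) else None, axis=1)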
Out[157]: 
0   -5.0
1   -5.0
2   -5.0
3   -5.0
4    NaN
dtype: float64
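If the function needs to receive an actual DataFrame per window (rather than a NumPy array), a plain loop over positional slices is another option. The following is a minimal sketch, not a pandas API: rolling_frame_apply is a hypothetical helper name, and it assumes func returns one scalar per window. Unlike the apply version above, it uses positional iloc slicing, so it does not depend on the index being 0..N-1, and it returns only the full windows (no trailing NaN).

```python
import numpy as np
import pandas as pd


def rolling_frame_apply(frame, window, func):
    """Apply func to every full window of `window` consecutive rows.

    Hypothetical helper for illustration; assumes func returns a scalar.
    """
    n_windows = len(frame) - window + 1
    results = [func(frame.iloc[start:start + window]) for start in range(n_windows)]
    # Label each result with the index of the first row in its window.
    return pd.Series(results, index=frame.index[:max(n_windows, 0)], dtype=float)


df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [6, 7, 8, 9, 10]})
result = rolling_frame_apply(df, 2, np.linalg.det)
print(result)
```

This is slower than the strided approach for large frames because it calls func once per window in Python, but it works for any function that expects a DataFrame.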