Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Rolling window of next rows in Polars

I want to create a new column that contains the max, min and average of the next X elements.

After searching for a while the only solution I was able to find was:

df = df.with_columns(
    pl.col("price").shift(-20).rolling_max(window_size=20).alias('max'), 
    pl.col("price").shift(-20).rolling_min(window_size=20).alias('min')
)

However for the first 20 rows there is no data and results in nulls because the .rolling_max function applies the windows of the previous rows and not the next.

Is there a better way to achieve this?

like image 208
elias Avatar asked Oct 22 '25 10:10

elias


2 Answers

@RomanPekar's solution works well for the creation of few columns. However, when working with multiple columns it could be preferable to use a full forward-looking rolling context using pl.DataFrame.group_by_dynamic as follows.

Data.

df = pl.DataFrame({
    "price": [1, 2, 6, 1, 3, 5, 8, 7],
})

Usage of pl.DataFrame.group_by_dynamic. I've added multiple aggregations of the rolling window and a separate window column to clarify the approach.

(
    df
    .group_by_dynamic(
        index_column=pl.int_range(0, pl.len()),
        every="1i", period="2i", offset="0i",
    )
    .agg(
        pl.col("price").first(),
        pl.col("price").alias("window"),
        pl.col("price").max().alias("min"),
        pl.col("price").max().alias("max"),
    )
)

Output.

shape: (8, 5)
┌─────────┬───────┬───────────┬─────┬─────┐
│ literal ┆ price ┆ window    ┆ min ┆ max │
│ ---     ┆ ---   ┆ ---       ┆ --- ┆ --- │
│ i64     ┆ i64   ┆ list[i64] ┆ i64 ┆ i64 │
╞═════════╪═══════╪═══════════╪═════╪═════╡
│ 0       ┆ 1     ┆ [1, 2]    ┆ 2   ┆ 2   │
│ 1       ┆ 2     ┆ [2, 6]    ┆ 6   ┆ 6   │
│ 2       ┆ 6     ┆ [6, 1]    ┆ 6   ┆ 6   │
│ 3       ┆ 1     ┆ [1, 3]    ┆ 3   ┆ 3   │
│ 4       ┆ 3     ┆ [3, 5]    ┆ 5   ┆ 5   │
│ 5       ┆ 5     ┆ [5, 8]    ┆ 8   ┆ 8   │
│ 6       ┆ 8     ┆ [8, 7]    ┆ 8   ┆ 8   │
│ 7       ┆ 7     ┆ [7]       ┆ 7   ┆ 7   │
└─────────┴───────┴───────────┴─────┴─────┘

This approach also allows a separate by column to perform these aggregations within groups defined by a separate column.

like image 196
Hericks Avatar answered Oct 24 '25 00:10

Hericks


I think you can use reverse() to apply rolling window to next rows insted of previous ones:

I've made a bit more simple example so you can see the results easier:

df = pl.DataFrame({
    "price": [1, 2, 6, 1, 3, 5, 8, 7],
})

┌───────┐
│ price │
│ ---   │
│ i64   │
╞═══════╡
│ 1     │
│ 2     │
│ 6     │
│ 1     │
│ 3     │
│ 5     │
│ 8     │
│ 7     │
└───────┘
df.with_columns(
    pl.col("price")
      .reverse()
      .rolling_max(window_size=2, min_periods=1)
      .reverse()
      .alias('max')
)

┌───────┬─────┐
│ price ┆ max │
│ ---   ┆ --- │
│ i64   ┆ i64 │
╞═══════╪═════╡
│ 1     ┆ 2   │ # max of [1,2]
│ 2     ┆ 6   │ # max of [2,6]
│ 6     ┆ 6   │ # max of [6,1]
│ 1     ┆ 3   │ # max of [1,3]
│ 3     ┆ 5   │ # max of [3,5]
│ 5     ┆ 8   │ # max of [5,8]
│ 8     ┆ 8   │ # max of [8,7]
│ 7     ┆ 7   │ # max of [7]
└───────┴─────┘
like image 26
Roman Pekar Avatar answered Oct 24 '25 01:10

Roman Pekar