Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I identify consecutive/contiguous dates in polars

I'm trying to isolate runs of dates using polars. For this, I've been playing with .rolling, .rle, and .rle_id but can't seem to fit them together to make it work.

Given this:

import polars as pl

df = pl.DataFrame({
    "date": ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-05", "2024-01-06", "2024-01-07"]
})
df = df.with_columns(pl.col("date").str.to_date())

I would like to have a 2nd column with the id of the run if they're contiguous:

shape: (6, 2)
┌────────────┬─────┐
│ date       ┆ id  │
│ ---        ┆ --- │
│ date       ┆ i64 │
╞════════════╪═════╡
│ 2024-01-01 ┆ 1   │
│ 2024-01-02 ┆ 1   │
│ 2024-01-03 ┆ 1   │
│ 2024-01-05 ┆ 2   │
│ 2024-01-06 ┆ 2   │
│ 2024-01-07 ┆ 2   │
└────────────┴─────┘
like image 748
Thomas Avatar asked Oct 19 '25 14:10

Thomas


1 Answers

You could take the diff of the successive dates, identify those that are not 1 day and compute a cum_sum:

(pl.DataFrame({'date': ['2024-01-01', '2024-01-02', '2024-01-03',
                        '2024-01-05', '2024-01-06', '2024-01-07']})
  .with_columns(pl.col('date').str.to_date())
  .with_columns(id=pl.col('date').diff().ne(pl.duration(days=1))
                     .fill_null(True).cum_sum())
)

Alternatively, generate a range of the step (with pl.duration), subtract it to the date and use rle_id:

(pl.DataFrame({'date': ['2024-01-01', '2024-01-02', '2024-01-03',
                        '2024-01-05', '2024-01-06', '2024-01-07']})
  .with_columns(pl.col('date').str.to_date())
  .with_columns(id=(pl.col('date')
                   -pl.duration(days=pl.arange(0, pl.col('date').len()))
                   ).rle_id()+1)
)

Output:

┌────────────┬─────┐
│ date       ┆ id  │
│ ---        ┆ --- │
│ date       ┆ u32 │
╞════════════╪═════╡
│ 2024-01-01 ┆ 1   │
│ 2024-01-02 ┆ 1   │
│ 2024-01-03 ┆ 1   │
│ 2024-01-05 ┆ 2   │
│ 2024-01-06 ┆ 2   │
│ 2024-01-07 ┆ 2   │
└────────────┴─────┘
like image 55
mozway Avatar answered Oct 21 '25 04:10

mozway