
Expand values of a column to n rows before and m rows after the value in a data frame

I have a data.frame representing different time series. In one column, I marked interesting time points (Note: There can be multiple interesting time points per Id):

Id Time Value Interesting
1 0 12 0
1 1 14 0
1 2 11 0
1 3 12 1
1 4 13 0
1 5 14 0
1 6 12 0
1 7 12 0
.. .. .. ..
78 128 13 0
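
For anyone who wants to try this out, here is a minimal sketch that rebuilds the rows shown above as a data.frame. It assumes integer columns and covers only the Id 1 rows; the elided rows are left out rather than guessed.

```r
# Sample data: only the Id == 1 rows shown in the question.
# Column types (integer) are an assumption; the elided rows are omitted.
df <- data.frame(
  Id          = rep(1L, 8),
  Time        = 0:7,
  Value       = c(12L, 14L, 11L, 12L, 13L, 14L, 12L, 12L),
  Interesting = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L)
)
```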

Now I would also like to mark the n time points before and the m time points after each interesting point as an interesting block. So with n = 2 and m = 3, I would expect this:

Id Time Value Interesting Block
1 0 12 0 0
1 1 14 0 1
1 2 11 0 1
1 3 12 1 1
1 4 13 0 1
1 5 14 0 1
1 6 12 0 1
1 7 12 0 0
.. .. .. .. ..
78 128 13 0 0

At the moment, I use gaussianSmooth() and a threshold:

df %>% mutate(Block = ifelse(gaussianSmooth(Interesting, sigma = 4) > 0.001, 1, 0))

But this is cumbersome and only works if n = m. Is there a “simpler” solution where I can easily set how many rows before and after should be changed? Solutions preferably in dplyr/tidyverse.

WitheShadow asked Sep 14 '25 18:09


1 Answer

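With group_modify (this also works when a group has several Interesting values). First get the row positions where Interesting == 1, then loop over them and set the surrounding rows to 1, clipping the window to the group with max(1, i - n):min(nrow(.x), i + m). Note that this overwrites Interesting in place rather than adding a separate Block column.

library(dplyr)
n <- 2  # rows to mark before each interesting point
m <- 3  # rows to mark after each interesting point

df %>% 
  group_by(Id) %>% 
  group_modify(~ {
    # row positions of the interesting points within this group
    idx <- which(.x$Interesting == 1)
    for (i in idx) {
      # clip the window to the first and last row of the group
      .x$Interesting[max(1, i - n):min(nrow(.x), i + m)] <- 1
    }
    .x
  })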

# A tibble: 8 × 4
# Groups:   Id [1]
     Id  Time Value Interesting
  <int> <int> <int>       <dbl>
1     1     0    12           0
2     1     1    14           1
3     1     2    11           1
4     1     3    12           1
5     1     4    13           1
6     1     5    14           1
7     1     6    12           1
8     1     7    12           0
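
If you would rather keep Interesting untouched and get the Block column from the question, one possible variation is to move the window logic into a small helper and call it from mutate(). This is only a sketch: expand_block() is a hypothetical name of my own, not a dplyr function, and the sample df below rebuilds just the Id 1 rows from the question.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): returns a 0/1 integer vector that
# is 1 from n positions before to m positions after every flagged element.
expand_block <- function(flag, n, m) {
  hits <- which(flag == 1)
  if (length(hits) == 0) return(integer(length(flag)))
  # Collect every position covered by a window, clipped to the vector bounds
  covered <- unlist(lapply(hits, function(i) {
    seq(max(1, i - n), min(length(flag), i + m))
  }))
  as.integer(seq_along(flag) %in% covered)
}

# Sample data: only the Id == 1 rows shown in the question
df <- data.frame(
  Id          = rep(1L, 8),
  Time        = 0:7,
  Value       = c(12L, 14L, 11L, 12L, 13L, 14L, 12L, 12L),
  Interesting = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L)
)

result <- df %>%
  group_by(Id) %>%
  mutate(Block = expand_block(Interesting, n = 2, m = 3)) %>%
  ungroup()
```

Because the helper works on a plain vector, grouping by Id keeps a window from spilling over into the next series, and n and m can be set independently.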
Maël answered Sep 17 '25 10:09