Longest consecutive count of the same value per group

I have a data.frame like the one below, and I want to add a variable giving the longest run of consecutive 1s in VALUE within each group (i.e. the largest number of consecutive rows with VALUE equal to 1, per GROUP_ID).

GROUP_ID    VALUE
    1         0
    1         1
    1         1
    1         1
    1         1
    1         0
    2         1
    2         1
    2         0
    2         1
    2         1
    2         1
    3         1
    3         0
    3         1
    3         0
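
For anyone who wants to run the answers, the table above could be rebuilt roughly as follows. This is only a sketch: the object names `dat` and `df` are assumptions chosen to match the names the two answers use, and the columns are assumed to be integers.

```r
# Rebuild the example data shown above (assumed integer columns).
dat <- data.frame(
  GROUP_ID = rep(1:3, times = c(6L, 6L, 4L)),
  VALUE    = c(0L, 1L, 1L, 1L, 1L, 0L,   # group 1
               1L, 1L, 0L, 1L, 1L, 1L,   # group 2
               1L, 0L, 1L, 0L)           # group 3
)

# The second answer calls the same data `df`, so keep a copy under that name.
df <- dat
```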

So the output would look like this:

GROUP_ID    VALUE    CONSECUTIVE
    1         0           4
    1         1           4
    1         1           4
    1         1           4
    1         1           4
    1         0           4
    2         1           3
    2         1           3
    2         0           3
    2         1           3
    2         1           3
    2         1           3
    3         1           1
    3         0           1
    3         1           1
    3         0           1

Any help would be greatly appreciated!

Chao asked Dec 07 '18 14:12


2 Answers

Using dplyr:

library(dplyr)
dat %>% 
  group_by(GROUP_ID) %>% 
  mutate(CONSECUTIVE = {rl <- rle(VALUE); max(rl$lengths[rl$values == 1])})

which gives:

# A tibble: 16 x 3
# Groups:   GROUP_ID [3]
   GROUP_ID VALUE CONSECUTIVE
      <int> <int>       <int>
 1        1     0           4
 2        1     1           4
 3        1     1           4
 4        1     1           4
 5        1     1           4
 6        1     0           4
 7        2     1           3
 8        2     1           3
 9        2     0           3
10        2     1           3
11        2     1           3
12        2     1           3
13        3     1           1
14        3     0           1
15        3     1           1
16        3     0           1
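
To see why the rle() expression inside mutate() works, it may help to run it by hand on a single group. The sketch below uses the VALUE column of GROUP_ID 2 from the question; the variable name `x` is just for illustration.

```r
# VALUE for GROUP_ID 2, copied from the question.
x <- c(1L, 1L, 0L, 1L, 1L, 1L)

# rle() collapses the vector into runs of identical values.
rl <- rle(x)
rl$values    # the value of each run:  1 0 1
rl$lengths   # the length of each run: 2 1 3

# Keep only the runs of 1s and take the longest one.
max(rl$lengths[rl$values == 1])   # 3
```

Because the result is a single number, mutate() recycles it across every row of the group, which is why all rows of a group share the same CONSECUTIVE value.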

Or with data.table:

library(data.table)
setDT(dat)            # convert to a 'data.table'

dat[, CONSECUTIVE := {rl <- rle(VALUE); max(rl$lengths[rl$values == 1])}
    , by = GROUP_ID][]
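
One edge case worth keeping in mind: if a group contains no 1s at all, the subset of run lengths is empty, and max() on an empty vector returns -Inf with a warning. The example data has no such group, so this is not an issue here. If it could happen in your data, a small guard like the sketch below might be used instead; `longest_run_of_ones` is a hypothetical helper name, not part of any package.

```r
# Hypothetical helper: longest run of 1s, or 0 when there are none.
longest_run_of_ones <- function(x) {
  rl <- rle(x)
  # Including 0L means max() never sees an empty argument list.
  max(0L, rl$lengths[rl$values == 1])
}

longest_run_of_ones(c(1L, 1L, 0L, 1L, 1L, 1L))  # 3
longest_run_of_ones(c(0L, 0L, 0L))              # 0
```

The helper can be dropped into either version above, e.g. `mutate(CONSECUTIVE = longest_run_of_ones(VALUE))` or `CONSECUTIVE := longest_run_of_ones(VALUE)`.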

Jaap answered Oct 06 '22 12:10


We can use ave with rle to get the longest run of consecutive 1s within each group (GROUP_ID).

df$Consecutive <- ave(df$VALUE, df$GROUP_ID, FUN = function(x) {
      y <- rle(x == 1)
      max(y$lengths[y$values])
})

df
#   GROUP_ID VALUE Consecutive
#1         1     0           4
#2         1     1           4
#3         1     1           4
#4         1     1           4
#5         1     1           4
#6         1     0           4
#7         2     1           3
#8         2     1           3
#9         2     0           3
#10        2     1           3
#11        2     1           3
#12        2     1           3
#13        3     1           1
#14        3     0           1
#15        3     1           1
#16        3     0           1

Ronak Shah answered Oct 06 '22 14:10