How to perform a group_by with elements that are contiguous in R and dplyr

Tags: r, dplyr, tidyverse

Suppose we have this tibble:

 group item
 x     1
 x     2
 x     2
 y     3
 z     2
 x     2
 x     2
 z     1

I want to group by 'group', but only rows that are adjacent should end up in the same group. In my case, that gives two separate 'x' groups and two separate 'z' groups, with 'item' summed within each run. The result would be something like:

group item
x 5
y 3
z 2
x 4
z 1

I know how to solve this problem with 'for' loops. However, that is slow and not very straightforward. I'd rather use a dplyr or tidyverse function with simple logic.

This question is not a duplicate. I know there's already a question about rle on SO, but mine is more general than that: I am asking for general solutions.

Guilherme Jardim Duarte asked Nov 20 '25 05:11


2 Answers

If you want to use only base R and the tidyverse, this code reproduces your desired result exactly:

library(dplyr)

mydf <- tibble(group = c("x", "x", "x", "y", "z", "x", "x", "z"),
               item = c(1, 2, 2, 3, 2, 2, 2, 1))

mydf

# A tibble: 8 × 2
  group  item
  <chr> <dbl>
1     x     1
2     x     2
3     x     2
4     y     3
5     z     2
6     x     2
7     x     2
8     z     1

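# rle() finds the runs of identical consecutive values in `group`
runs <- rle(mydf$group)

mydf %>%
  # label every row with the index of the run it belongs to
  mutate(run_id = rep(seq_along(runs$lengths), runs$lengths)) %>%
  group_by(group, run_id) %>%
  summarise(item = sum(item)) %>%
  # restore the original order of the runs, then drop the helper column
  arrange(run_id) %>%
  select(-run_id)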

Source: local data frame [5 x 2]
Groups: group [3]

  group  item
  <chr> <dbl>
1     x     5
2     y     3
3     z     2
4     x     4
5     z     1
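
The run_id column above can also be built without rle. As a minimal sketch, assuming the same mydf as above: a new run starts wherever group differs from the previous row, so a cumulative sum over those change points numbers the runs. The name `result` is only illustrative. With dplyr 1.1.0 or later, consecutive_id(group) should produce the same ids.

```r
library(dplyr)

mydf <- tibble(group = c("x", "x", "x", "y", "z", "x", "x", "z"),
               item = c(1, 2, 2, 3, 2, 2, 2, 1))

result <- mydf %>%
  # TRUE wherever `group` changes from the previous row; the default makes
  # the first row compare against itself, so the first run gets id 1
  mutate(run_id = cumsum(group != lag(group, default = first(group))) + 1L) %>%
  # grouping by run_id first keeps the runs in their original order
  group_by(run_id, group) %>%
  summarise(item = sum(item), .groups = "drop") %>%
  select(-run_id)

print(result)
```

Putting run_id first in group_by means summarise returns the rows already sorted by run, so no separate arrange step is needed.
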
HAVB answered Nov 22 '25 19:11


You can construct group identifiers with rle, but the easier route is to just use data.table::rleid, which does it for you:

library(dplyr)

# df is the data from the question; rleid() requires the data.table package
df %>% 
    group_by(group, 
             group_run = data.table::rleid(group)) %>% 
    summarise_all(sum)
#> # A tibble: 5 x 3
#> # Groups:   group [?]
#>    group group_run  item
#>   <fctr>     <int> <int>
#> 1      x         1     5
#> 2      x         4     4
#> 3      y         2     3
#> 4      z         3     2
#> 5      z         5     1
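
Note that the output above is sorted by group, not by where each run appears in the data. A possible variation, sketched under the assumption that data.table is installed and using a stand-in `df` built here (the original `df` is not shown), groups by the run id first so the rows come back in their original order, then drops the helper column:

```r
library(dplyr)

# Assumed stand-in for the question's data
df <- data.frame(group = c("x", "x", "x", "y", "z", "x", "x", "z"),
                 item = c(1L, 2L, 2L, 3L, 2L, 2L, 2L, 1L))

ordered <- df %>%
  # rleid() gives each run of identical consecutive values its own integer id
  group_by(group_run = data.table::rleid(group), group) %>%
  summarise(item = sum(item), .groups = "drop") %>%
  select(-group_run)

print(ordered)
```
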
alistaire answered Nov 22 '25 19:11


