
How to sum values by groups in sequence

Tags: r, sum, grouping

I have a dataframe with duration values in column duration and group values in column gaze_focus.

df1
   duration gaze_focus
29    1.011  periphery
31    1.590     center
33    1.582     center
35    0.571  periphery
37    0.561     center
39    2.136     center
41    0.181  periphery
43    1.475     center
45    0.177  periphery
47    0.940  periphery
49    2.102     center

I'd like to compute the sums of immediately adjacent identical group values to obtain this result:

df2
  duration gaze_focus
1    1.011  periphery
2    3.172     center
3    0.571  periphery
4    2.697     center
5    0.181  periphery
6    1.475     center
7    1.117  periphery
8    2.102     center

I know that operations such as summing by group can be done with e.g. aggregate or tapply, but I don't know how to sum values separately within each run of consecutive identical group values. Help is appreciated!

Reproducible data:

df1 <- structure(list(duration = c(1.011, 1.59, 1.582, 0.571, 0.561, 
2.136, 0.181, 1.475, 0.177, 0.94, 2.102), gaze_focus = c("periphery", 
"center", "center", "periphery", "center", "center", "periphery", 
"center", "periphery", "periphery", "center")), row.names = c(29L, 
31L, 33L, 35L, 37L, 39L, 41L, 43L, 45L, 47L, 49L), class = "data.frame")
Chris Ruehlemann asked Sep 17 '25 15:09


1 Answer

One dplyr option could be:

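library(dplyr)

df1 %>%
 # rleid numbers each run of consecutive identical gaze_focus values
 group_by(gaze_focus, rleid = with(rle(gaze_focus), rep(seq_along(lengths), lengths))) %>%
 summarise_all(sum) %>%
 # restore the original order of the runs
 arrange(rleid)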

  gaze_focus rleid duration
  <chr>      <int>    <dbl>
1 periphery      1    1.01 
2 center         2    3.17 
3 periphery      3    0.571
4 center         4    2.70 
5 periphery      5    0.181
6 center         6    1.48 
7 periphery      7    1.12 
8 center         8    2.10 
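
For comparison, the same run-length idea can be sketched in base R with rle and tapply, which the question mentions. This is only a minimal sketch under the assumption that gaze_focus is a character vector; the names runs and run_id are illustrative and do not come from the original posts.

```r
# Data from the question (row names omitted for brevity)
df1 <- data.frame(
  duration = c(1.011, 1.59, 1.582, 0.571, 0.561, 2.136,
               0.181, 1.475, 0.177, 0.94, 2.102),
  gaze_focus = c("periphery", "center", "center", "periphery", "center",
                 "center", "periphery", "center", "periphery", "periphery",
                 "center"),
  stringsAsFactors = FALSE
)

# rle() finds runs of consecutive identical values
runs <- rle(df1$gaze_focus)

# One integer id per run, repeated for every row in that run
# (run_id is a hypothetical helper name)
run_id <- rep(seq_along(runs$lengths), runs$lengths)

# Sum duration within each run; runs$values gives one label per run
df2 <- data.frame(
  duration = as.numeric(tapply(df1$duration, run_id, sum)),
  gaze_focus = runs$values,
  stringsAsFactors = FALSE
)

print(df2)
```

Because run_id increases along the data, the sums come back in the original run order, so no extra sorting step should be needed.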
tmfmnk answered Sep 19 '25 05:09