
Calculating sequences based on summary counts

Tags: r

I am trying to construct a sequence of rows/values from the following data:

# A tibble: 4 x 2
  year_row breaks
  <chr>     <int>
1 2015          7
2 2016          6
3 2017          5
4 2018          5

That is, each range ends at the running total of breaks:

7
7 + 6 = 13
13 + 5 = 18
18 + 5 = 23

Expected output:

2015     1:7
2016     8:13
2017     14:18
2018     19:23

I want to use these sequences later in some functions/loops.

Data:

structure(list(year_row = c("2015", "2016", "2017", "2018"), 
    breaks = c(7L, 6L, 5L, 5L)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L))
user8959427 asked Apr 23 '19 12:04



2 Answers

We take the cumulative sum of 'breaks' for the end of each range, the cumulative sum of the lag of 'breaks' plus 1 for the start, and then paste the two together:

library(dplyr)
library(stringr)
# df1 is the tibble from the question
df1 %>% 
   mutate(new = cumsum(breaks),                             # end of each range
          new2 = cumsum(lag(breaks, default = 0)) + 1) %>%  # start of each range
   transmute(year_row, new3 = str_c(new2, new, sep = ":"))
# A tibble: 4 x 2
#  year_row new3 
#  <chr>    <chr>
#1 2015     1:7  
#2 2016     8:13 
#3 2017     14:18
#4 2018     19:23
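
The strings above are convenient for display. For the loop use case mentioned in the question, integer vectors may be easier to work with. The following is only a sketch of a different base R technique (using split, not the dplyr approach above); the names row_ids, labels and seqs are made up for illustration, and the data is rebuilt as a plain data.frame.

```r
# Data from the question, rebuilt as a plain data.frame
df1 <- data.frame(year_row = c("2015", "2016", "2017", "2018"),
                  breaks = c(7L, 6L, 5L, 5L),
                  stringsAsFactors = FALSE)

# Number the rows 1..sum(breaks), label each row with its year,
# then split the row numbers by that label
row_ids <- seq_len(sum(df1$breaks))
labels  <- rep(df1$year_row, times = df1$breaks)
seqs    <- split(row_ids, labels)

seqs[["2016"]]
#[1]  8  9 10 11 12 13
```

Note that split orders the groups by sorted label, which matches the row order here only because the years are already ascending.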
akrun answered Oct 12 '22 23:10



An idea using base R:

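v1 <- cumsum(df$breaks)   # end of each range
v2 <- c(1, v1+1)          # start of each range (the last element is one past the end and is dropped below)
paste(v2[-length(v2)], v1, sep = ':')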
#[1] "1:7"   "8:13"  "14:18" "19:23"

If you want them as actual vectors rather than strings, you can use Map.

Assuming that v1 and v2 have already been constructed as shown above:

Map(`:`, v2[-length(v2)], v1)
#[[1]]
#[1] 1 2 3 4 5 6 7

#[[2]]
#[1]  8  9 10 11 12 13

#[[3]]
#[1] 14 15 16 17 18

#[[4]]
#[1] 19 20 21 22 23

To attach the result to your data frame as a list column:

df$ranges <- Map(`:`, v2[-length(v2)], v1)
df
# A tibble: 4 x 3
#  year_row breaks ranges   
#  <chr>     <int> <list>   
#1 2015          7 <int [7]>
#2 2016          6 <int [6]>
#3 2017          5 <int [5]>
#4 2018          5 <int [5]>
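
Since the question mentions using the sequences in functions or loops, here is a minimal sketch of how that might look. It assumes the question's data in a plain data.frame called df; the names starts, ends and ranges are illustrative only, and the start positions are computed as end - breaks + 1 instead of via v2.

```r
# Rebuild the question's data (a plain data.frame is enough here)
df <- data.frame(year_row = c("2015", "2016", "2017", "2018"),
                 breaks = c(7L, 6L, 5L, 5L),
                 stringsAsFactors = FALSE)

ends   <- cumsum(df$breaks)       # 7 13 18 23
starts <- ends - df$breaks + 1L   # 1  8 14 19

# Named list of integer index vectors, one per year
ranges <- setNames(Map(`:`, starts, ends), df$year_row)

# Example use in a loop: report the rows covered by each year
for (yr in names(ranges)) {
  idx <- ranges[[yr]]
  cat(yr, ": rows", min(idx), "to", max(idx), "(n =", length(idx), ")\n")
}
```

Naming the list by year lets you look up a range directly, e.g. ranges[["2017"]], instead of tracking positions.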
Sotos answered Oct 13 '22 01:10
