R - Split numeric vector into intervals

I have a question about "splitting" a vector, although other approaches might also work. I have a data.frame (df) which looks like this (simplified version):

  case time
1    1    5
2    2    3
3    3    4
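
For anyone who wants to reproduce the example, the table above could be built roughly like this. This is only a sketch: storing both columns as integers is an assumption, since the post does not state the column types.

```r
# Example data from the table above; integer columns are an assumption.
df <- data.frame(case = 1:3, time = c(5L, 3L, 4L))
df
```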

The "time" variable counts units of time (days, weeks, etc.) until an event occurs. I would like to expand the data set by adding rows and "splitting" the "time" into intervals of length 1, beginning at 2. The result might then look something like this:

    case    time    begin   end
1   1       5       2       3
2   1       5       3       4
3   1       5       4       5
4   2       3       2       3
5   3       4       2       3
6   3       4       3       4

Obviously, my data set is a bit larger than this example. What would be a feasible method to achieve this result?

I had one idea of beginning with

df.exp <- df[rep(row.names(df), df$time - 2), 1:2]

in order to expand the number of rows per case, according to the number of time intervals. Based on this, a "begin" and "end" column might be added in the fashion of:

df.exp$begin <- 2:(df.exp$time-1)

However, I'm not successful at creating the respective columns, because this command only uses the first row to calculate (df.exp$time-1) and doesn't automatically distinguish by "case".
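
The behaviour described above can be reproduced with a short sketch. It assumes the three-row example data (rebuilt here so the snippet runs on its own) and uses a helper name, upper, that is not part of the original attempt. The colon operator takes a single number on each side, so given a longer vector it works from the first element only, and the data frame assignment then recycles the short result across all rows.

```r
# Rebuild the example data (integer columns are an assumption).
df <- data.frame(case = 1:3, time = c(5L, 3L, 4L))
df.exp <- df[rep(row.names(df), df$time - 2), 1:2]

# One upper bound per expanded row: 4 4 4 2 3 3
upper <- df.exp$time - 1

# `:` only uses the first element of a longer vector, so the attempt
# is effectively 2:upper[1], which is 2 3 4 regardless of the case.
first_only <- 2:upper[1]

# The data frame has 6 rows, a multiple of 3, so the value is recycled
# silently instead of raising an error.
df.exp$begin <- first_only
df.exp$begin
# 2 3 4 2 3 4   (wanted: 2 3 4 2 2 3)
```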

Any ideas would be very much appreciated!

asked Jul 08 '15 11:07 by Fabian


1 Answer

You can try

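# df1 is the data frame from the question (called df there)
# Repeat each row once per unit interval, i.e. (time - 2) times
df2 <- df1[rep(seq_len(nrow(df1)), df1$time - 2), ]
row.names(df2) <- NULL
# For each case, pair consecutive values of 2..time as (begin, end) rows
m1 <- do.call(rbind,
          Map(function(x, y) {
                  v1 <- seq(x, y)
                  cbind(v1[-length(v1)], v1[-1L])},
                  2, df1$time))
df2[c('begin', 'end')] <- m1
df2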
#  case time begin end
#1    1    5     2   3
#2    1    5     3   4
#3    1    5     4   5
#4    2    3     2   3
#5    3    4     2   3
#6    3    4     3   4

Or an option with data.table

library(data.table)
# setDT() converts df1 to a data.table by reference; the block runs once per case
setDT(df1)[, {tmp <- seq(2, time)
              list(time = time,
                   begin = tmp[-length(tmp)],
                   end = tmp[-1])}, by = case]
#   case time begin end
#1:    1    5     2   3
#2:    1    5     3   4
#3:    1    5     4   5
#4:    2    3     2   3
#5:    3    4     2   3
#6:    3    4     3   4
answered Sep 29 '22 07:09 by akrun
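
A further option, not taken from the answer above, is to stay close to the approach in the question and build the "begin" column with base R's sequence(), which counts 1..n separately for each element of its argument. The following is a minimal sketch under the assumptions that every "time" is at least 2 and that the data look like the three-row example; the names n and out are illustrative only.

```r
# Rebuild the example data (integer columns are an assumption).
df <- data.frame(case = 1:3, time = c(5L, 3L, 4L))

# Number of unit intervals per case; assumes time >= 2 everywhere.
n <- df$time - 2L

# Repeat each row once per interval and reset the row names.
out <- df[rep(seq_len(nrow(df)), n), ]
row.names(out) <- NULL

# sequence(n) restarts at 1 for each case; shift it so intervals begin at 2.
out$begin <- sequence(n) + 1L
out$end <- out$begin + 1L
out
```

Because sequence() restarts its count for every case, no grouping step is needed, and a case with time equal to 2 simply contributes no rows.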