Reduce each consecutive sequence to its value and length

Assume you have a vector with runs of consecutive values:

v <- c(1, 1, 1,  2, 2, 2, 2,  1, 1,  3, 3, 3, 3)

How can it best be reduced to one value per run plus the length of each run? I.e. the first run is 1 repeated three times; 2nd run: 2 repeated four times; 3rd run: 1 repeated two times, and so on:

v.df <- data.frame(value = c(1, 2, 1, 3),
                   repetitions = c(3, 4, 2, 4))

In a procedural language I might just iterate through a loop and build the data.frame as I go, but with a large dataset in R such an approach is inefficient. Any advice?

russellpierce asked Dec 01 '22 10:12


2 Answers

Or, more simply:

data.frame(rle(v)[])
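
For context, here is a minimal sketch of what `rle()` returns and why it converts to a data frame so directly. The result is a list of class "rle" with two components, `lengths` and `values`. Note that `unclass()` is used below as a more explicit stand-in for the empty `[]`; that substitution is an assumption of this sketch, not part of the one-liner above.

```r
v <- c(1, 1, 1,  2, 2, 2, 2,  1, 1,  3, 3, 3, 3)

r <- rle(v)   # run-length encoding: a list of class "rle"
names(r)      # its components: "lengths" and "values"

# Drop the "rle" class so data.frame() sees a plain list of two
# equal-length vectors and turns each one into a column.
v.df <- data.frame(unclass(r))
v.df
```

The columns come out in the order the list stores them (`lengths` first, then `values`), so reorder or rename them if a particular layout is needed.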
kohske answered Dec 05 '22 02:12


with(rle(v), data.frame(values, lengths))

This should get you what you need:

values lengths
     1       3
     2       4
     1       2
     3       4
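
If the column names from the question (`value`, `repetitions`) are wanted, the same call can rename them on the way in. The sketch below assumes the `v` from the question and adds a round-trip check that is not part of the original answer: repeating each value by its run length should rebuild the input.

```r
v <- c(1, 1, 1,  2, 2, 2, 2,  1, 1,  3, 3, 3, 3)

# Same rle() result, with the columns renamed to match the question.
v.df <- with(rle(v), data.frame(value = values, repetitions = lengths))

# Sanity check: expanding each value by its run length rebuilds v.
roundtrip.ok <- identical(rep(v.df$value, v.df$repetitions), v)
roundtrip.ok
```

Base R also provides `inverse.rle()` for reversing an "rle" object directly, which is handy when the data frame step is not needed.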
Greg answered Dec 05 '22 03:12