
Get start and end index of runs of values [duplicate]

Tags: r, sequence

I have a vector:

a <- c(1, 1, 0, 0, 1, 2, 0, 0)

I would like to get the start and end indexes of each run of equal values:

number start  end
0        3     4
0        7     8
1        1     2
1        5     5
2        6     6
janicebaratheon asked Oct 26 '17 18:10



3 Answers

A base R solution using rle() (run-length encoding).

a <- c(1,1,0,0,1,2,0,0)

# Get run length encoding
b <- rle(a)

# Create a data frame
dt <- data.frame(number = b$values, lengths = b$lengths)
# Get the end
dt$end <- cumsum(dt$lengths)
# Get the start
dt$start <- dt$end - dt$lengths + 1

# Select columns
dt <- dt[, c("number", "start", "end")]
# Sort rows
dt <- dt[order(dt$number), ]

dt
#  number start end
#2      0     3   4
#5      0     7   8
#1      1     1   2
#3      1     5   5
#4      2     6   6
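
To see why the cumulative sum yields the run ends, it may help to look at the intermediate values. This is only an illustrative sketch of the steps above; the names ends and starts are introduced here and are not part of the answer.

```r
a <- c(1, 1, 0, 0, 1, 2, 0, 0)
b <- rle(a)

b$values   # value of each run:  1 0 1 2 0
b$lengths  # length of each run: 2 2 1 1 2

# The end of each run is the running total of the run lengths
ends <- cumsum(b$lengths)        # 2 4 5 6 8
# The start is the end minus the run length, plus one
starts <- ends - b$lengths + 1   # 1 3 5 6 7
```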

Update

Here is a more concise version that uses with() to refer to the values and lengths components of the rle() result directly.

with(rle(a), data.frame(number = values,
                        start = cumsum(lengths) - lengths + 1,
                        end = cumsum(lengths))[order(values),])
#  number start end
#2      0     3   4
#5      0     7   8
#1      1     1   2
#3      1     5   5
#4      2     6   6
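
If this is needed in more than one place, the same logic could be wrapped in a small function. The sketch below assumes a plain atomic vector as input; run_bounds is a hypothetical name chosen for illustration, and resetting the row names is an optional extra that the answer above does not do.

```r
# run_bounds is a hypothetical helper name, not part of base R
run_bounds <- function(x) {
  r <- rle(x)
  end <- cumsum(r$lengths)
  out <- data.frame(number = r$values,
                    start = end - r$lengths + 1,
                    end = end)
  # Sort by value; order() keeps tied rows in their original order
  out <- out[order(out$number), ]
  # Drop the original run positions that were kept as row names
  rownames(out) <- NULL
  out
}

res <- run_bounds(c(1, 1, 0, 0, 1, 2, 0, 0))
res
```

Note that rle() treats each NA as its own run, so input with missing values may need separate handling.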
www answered Nov 15 '22 23:11



A solution using dplyr together with rleid() from data.table, which assigns an id to each run of equal values:

library(data.table)
library(dplyr)
a <- c(1, 1, 0, 0, 1, 2, 0, 0)
df <- data.frame(number = a)
# Label each run of equal values with a run id
df$Id <- data.table::rleid(df$number)
# Record each element's position in the vector
df$rowname <- seq_along(a)
df %>%
  group_by(Id, number) %>%
  summarise(start = first(rowname), end = last(rowname)) %>%
  arrange(number)

# Groups:   Id [5]
     Id number start   end
  <int>  <dbl> <int> <int>
1     2      0     3     4
2     5      0     7     8
3     1      1     1     2
4     3      1     5     5
5     4      2     6     6
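
For readers who want to see what the run id looks like without installing either package, the grouping step can be approximated in base R. This is a rough sketch that assumes a numeric vector with no missing values; run_id and pos are names introduced here for illustration.

```r
a <- c(1, 1, 0, 0, 1, 2, 0, 0)

# Base R stand-in for rleid() on a numeric vector without NAs:
# the id increases by one wherever the value changes
run_id <- cumsum(c(TRUE, diff(a) != 0))
run_id
# 1 1 2 2 3 4 5 5

# Group the positions by run id and take the first and last of each group
pos <- seq_along(a)
res <- data.frame(Id = unique(run_id),
                  number = a[!duplicated(run_id)],
                  start = as.vector(tapply(pos, run_id, min)),
                  end = as.vector(tapply(pos, run_id, max)))
res[order(res$number), ]
```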
BENY answered Nov 15 '22 23:11



A solution using a for loop in base R:

a <- c(1, 1, 0, 0, 1, 2, 0, 0)

start <- 1
res <- data.frame()
n <- length(a)

for (index in seq_along(a)) {
  # A run ends at the last element or where the next value differs
  if (index == n || a[index] != a[index + 1]) {
    res <- rbind(res,
                 data.frame(element = a[index], start = start, stop = index))
    start <- index + 1
  }
}

Printing res gives the runs in order of appearance:

  element start stop
1       1     1    2
2       0     3    4
3       1     5    5
4       2     6    6
5       0     7    8
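
Growing a data frame with rbind() inside a loop can become slow on long vectors. The same boundary test can be written without a loop; the following is a minimal sketch, assuming a non-empty vector with no missing values, where starts and stops are names introduced here for illustration.

```r
a <- c(1, 1, 0, 0, 1, 2, 0, 0)
n <- length(a)

# A run ends wherever the next value differs, and at the last element
stops <- c(which(a[-n] != a[-1]), n)
# Each run starts one position after the previous run ends
starts <- c(1, head(stops, -1) + 1)

res <- data.frame(element = a[starts], start = starts, stop = stops)
res
```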
Stijn answered Nov 15 '22 21:11
