Get indexes of a vector of numbers in another vector

Question

Let's suppose we have the following vector:

v <- c(2,2,3,5,8,0,32,1,3,12,5,2,3,5,8,33,1)

Given a sequence of numbers, for instance c(2,3,5,8), I am trying to find what the position of this sequence of numbers is in the vector v. The result I expect is something like:

FALSE TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE FALSE FALSE

I am trying to use which(v == c(2,3,5,8)), but it doesn't give me what I am looking for.

talat · Accepted Answer

Using base R you could do the following:

v <- c(2,2,3,5,8,0,32,1,3,12,5,2,3,5,8,33,1)
x <- c(2,3,5,8)

idx <- which(v == x[1])
idx[sapply(idx, function(i) all(v[i:(i+(length(x)-1))] == x))]
# [1]  2 12

This tells you that the exact sequence appears twice, starting at positions 2 and 12 of your vector v.

It first checks the possible starting positions, i.e. where v equals the first value of x and then loops through these positions to check if the values after these positions also equal the other values of x.

Jaap · Answer

Two other approaches using the shift-function trom data.table:

library(data.table)

# option 1
which(rowSums(mapply('==',
                     shift(v, type = 'lead', n = 0:(length(x) - 1)),
                     x)
              ) == length(x))

# option 2
which(Reduce("+", Map('==',
                      shift(v, type = 'lead', n = 0:(length(x) - 1)),
                      x)
             ) == length(x))

both give:

[1]  2 12

To get a full vector of the matching positions:

l <- length(x)
w <- which(Reduce("+", Map('==',
                           shift(v, type = 'lead', n = 0:(l - 1)),
                           x)
                  ) == l)
rep(w, each = l) + 0:(l-1)

which gives:

[1]  2  3  4  5 12 13 14 15

The benchmark which was included earlier in this answer has been moved to a separate community wiki answer.

Used data:

v <- c(2,2,3,5,8,0,32,1,3,12,5,2,3,5,8,33,1)
x <- c(2,3,5,8)

jogo · Answer

You can use rollapply() from zoo

v <- c(2,2,3,5,8,0,32,1,3,12,5,2,3,5,8,33,1)
x <- c(2,3,5,8)

library("zoo")
searchX <- function(x, X) all(x==X)
rollapply(v, FUN=searchX, X=x, width=length(x))

The result TRUEshows you the beginning of the sequence.
The code could be simplified to rollapply(v, length(x), identical, x) (thanks to G. Grothendieck):

set.seed(2)
vl <- as.numeric(sample(1:10, 1e6, TRUE))
# vm <- vl[1:1e5]
# vs <- vl[1:1e4]
x <- c(2,3,5)

library("zoo")
searchX <- function(x, X) all(x==X)
i1 <- rollapply(vl, FUN=searchX, X=x, width=length(x))
i2 <- rollapply(vl, width=length(x), identical, y=x)

identical(i1, i2)

For using identical() both arguments must be of the same type (num and int are not the same).
If needed == coerces int to num; identical() does not any coercion.

Frank · Answer

I feel like looping should be efficient:

w = seq_along(v)
for (i in seq_along(x)) w = w[v[w+i-1L] == x[i]]

w 
# [1]  2 12

This should be writable in C++ following @SymbolixAU approach for extra speed.

A basic comparison:

# create functions for selected approaches
redjaap <- function(v,x)
  which(Reduce("+", Map('==', shift(v, type = 'lead', n = 0:(length(x) - 1)), x)) == length(x))
loop <- function(v,x){
  w = seq_along(v)
  for (i in seq_along(x)) w = w[v[w+i-1L] == x[i]]
  w
}

# check consistency
identical(redjaap(v,x), loop(v,x))
# [1] TRUE

# check speed
library(microbenchmark)
vv <- rep(v, 1e4)
microbenchmark(redjaap(vv,x), loop(vv,x), times = 100)
# Unit: milliseconds
#            expr      min       lq      mean   median       uq       max neval cld
#  redjaap(vv, x) 5.883809 8.058230 17.225899 9.080246 9.907514  96.35226   100   b
#     loop(vv, x) 3.629213 5.080816  9.475016 5.578508 6.495105 112.61242   100  a 

# check consistency again
identical(redjaap(vv,x), loop(vv,x))
# [1] TRUE

Get indexes of a vector of numbers in another vector

Tags:

r

vector

eirvine

4 Answers

talat

Jaap

jogo

Frank

Recent Activity

Donate For Us

Get indexes of a vector of numbers in another vector

Tags:

r

vector

eirvine

4 Answers

talat

Jaap

jogo

Frank

Related questions

Recent Activity

Donate For Us