Here I have two string vectors whose order is important and cannot be changed.
vec1 <- c("carrot","carrot","carrot","apple","apple","mango","mango","cherry","cherry")
vec2 <- c("cherry","apple")
I wish to find out whether the elements of vec2 appear in vec1 and, if so, where (index/position) and in what order.
I tried which(vec1 %in% vec2), which gives 4 5 8 9. These are the correct indices, but in the wrong order. I tried match(vec2,vec1), which gives 8 4. Only the first match is returned. This would work if vec1 were unique.
Ideally, I am looking for this result: 8 9 4 5.
cherry is matched first, at positions 8 and 9, and then apple is matched at 4 and 5.
Is there a smart way to do this without resorting to loops?
You can try this:
unlist(lapply(vec2, function(x) which(vec1 %in% x)))
[1] 8 9 4 5
This returns, for each element of vec2 in turn, the positions in vec1 where that element occurs.
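As a rough sketch of the same idea, the per-element positions can be kept in a named list before flattening, which makes it easier to see which indices belong to which element. The name positions is just for illustration and is not part of the original answer; which(vec1 == x) is used here in place of %in%, which should be equivalent for a single value.

```r
vec1 <- c("carrot","carrot","carrot","apple","apple","mango","mango","cherry","cherry")
vec2 <- c("cherry","apple")

# One entry per element of vec2, holding its positions in vec1
# (an element of vec2 that is absent from vec1 gives integer(0))
positions <- lapply(vec2, function(x) which(vec1 == x))
names(positions) <- vec2

# Flatten in vec2 order, dropping the names
result <- unlist(positions, use.names = FALSE)
result
```

Elements of vec2 that do not occur in vec1 simply contribute nothing to the flattened result.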
which(!is.na(match(vec1,vec2)))[order(match(vec1,vec2)[!is.na(match(vec1,vec2))])]
Wow...there's probably an easier way to do this but...
> match(vec1,vec2)
[1] NA NA NA 2 2 NA NA 1 1
OK, so by reversing the match, I can use which() to get the indices where it's not NA
> which(!is.na(match(vec1,vec2)))
[1] 4 5 8 9
This gets the indices you want, but not in the order you want. Applying order() to the match() vector lets me re-sort them into the desired order. Here, I match again, and keep only the non-NA values.
> order(match(vec1,vec2)[!is.na(match(vec1,vec2))])
[1] 3 4 1 2
Index the which() result by this and you get:
> which(!is.na(match(vec1,vec2)))[order(match(vec1,vec2)[!is.na(match(vec1,vec2))])]
[1] 8 9 4 5
If this is slow, save the result of match(vec1,vec2) in a variable first so that it is not recomputed over and over again.
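A minimal sketch of that suggestion, with the match computed once and reused. The variable names m, hits and res are only illustrative. The last line shows what should be an equivalent shortcut, assuming order() with na.last = NA drops the NA entries as documented.

```r
vec1 <- c("carrot","carrot","carrot","apple","apple","mango","mango","cherry","cherry")
vec2 <- c("cherry","apple")

m    <- match(vec1, vec2)       # position in vec2 of each vec1 element, NA if absent
hits <- which(!is.na(m))        # indices of vec1 that occur in vec2
res  <- hits[order(m[hits])]    # reorder those indices by their position in vec2
res

# Possible shortcut: order the match vector directly and drop the NAs
res2 <- order(m, na.last = NA)
```

Ties in order() keep their original ordering, so repeated elements stay in the order in which they appear in vec1.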