I want to know if two vectors have any elements in common. I don't care what the elements are, how many common elements there are, or what positions they are at within either vector. I just need a simple, efficient function EIC(vec1, vec2) that returns TRUE if there exists some element in both vec1 and vec2, and FALSE if there are no elements common to both. We can also assume that neither vec1 nor vec2 contains NA, but either may have duplicated values.
I've thought of five ways to do this, but they all seem inefficient:
EIC.1 <- function(vec1, vec2) length(intersect(vec1, vec2)) > 0
# I want a function that will stop when it finds the first
# common element between the vectors, and return TRUE. The
# intersect function will continue on and check whether there are
# any other common elements.
EIC.2 <- function(vec1, vec2) any(vec1 %in% vec2)
EIC.3 <- function(vec1, vec2) any(!is.na(match(vec1, vec2)))
# the match function goes to the trouble of finding the position
# of all matches; I don't need the position but just want to know
# if any exist
EIC.4 <- function(vec1, vec2) {
  uvec1 <- unique(vec1)
  uvec2 <- unique(vec2)
  length(unique(c(uvec1, uvec2))) < length(uvec1) + length(uvec2)
}
EIC.5 <- function(vec1, vec2) !!anyDuplicated(c(unique(vec1), unique(vec2)))
# per https://stackoverflow.com/questions/5263498/how-to-test-whether-a-vector-contains-repetitive-elements#comment5931428_5263593
# I suspect this is the most efficient of the five, because
# anyDuplicated will stop looking when it comes to the first one,
# but I'm not sure about using !! to coerce to boolean type
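For reference, here is a small sketch of what anyDuplicated() hands back and how !! treats it. anyDuplicated() returns the index of the first repeated entry as an integer, or 0 when there is none, so !! turns it into TRUE or FALSE. Comparing against 0L is an equivalent and arguably more readable spelling. The vectors a and b are made up for illustration.

```r
a <- c(9, 8, 75, 62)   # hypothetical example data
b <- c(12, 62, 12)

combined <- c(unique(a), unique(b))   # 9 8 75 62 12 62
idx <- anyDuplicated(combined)        # index of the first repeated value, or 0L if none

has_common_bang <- !!idx              # double negation coerces the integer to logical
has_common_cmp  <- idx > 0L           # same result, stated explicitly

no_dup <- anyDuplicated(c(1, 2, 3))   # no repeats here, so this is 0L
```

Both spellings give the same logical value, so the choice between them is a matter of style.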
I will be using very long vectors (without any NAs, as previously mentioned) and will be running this function millions of times, which is why I am looking for something efficient. Here is some test data:
v1 <- c(9, 8, 75, 62)
v2 <- c(20, 75, 341, 987, 8)
v3 <- c(154, 62, 62, 143, 154, 95)
v4 <- c(12, 62, 12)
EIC <- EIC.1
EIC(v1, v2)
EIC(v1, v3)
EIC(v1, v4)
EIC(v2, v3)
EIC(v2, v4)
EIC(v3, v4)
Correct results are TRUE, TRUE, TRUE, FALSE, FALSE, TRUE.
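To check every candidate against these expected results in one go, something like the following sketch could be used. The names candidates, pairs, expected and check_candidate are hypothetical helpers introduced only for this example; the five function bodies are copied from above.

```r
# The five candidates from the question, collected in a named list.
candidates <- list(
  EIC.1 = function(vec1, vec2) length(intersect(vec1, vec2)) > 0,
  EIC.2 = function(vec1, vec2) any(vec1 %in% vec2),
  EIC.3 = function(vec1, vec2) any(!is.na(match(vec1, vec2))),
  EIC.4 = function(vec1, vec2) {
    uvec1 <- unique(vec1)
    uvec2 <- unique(vec2)
    length(unique(c(uvec1, uvec2))) < length(uvec1) + length(uvec2)
  },
  EIC.5 = function(vec1, vec2) !!anyDuplicated(c(unique(vec1), unique(vec2)))
)

v1 <- c(9, 8, 75, 62)
v2 <- c(20, 75, 341, 987, 8)
v3 <- c(154, 62, 62, 143, 154, 95)
v4 <- c(12, 62, 12)

# Same six comparisons, in the same order as the calls above.
pairs <- list(list(v1, v2), list(v1, v3), list(v1, v4),
              list(v2, v3), list(v2, v4), list(v3, v4))
expected <- c(TRUE, TRUE, TRUE, FALSE, FALSE, TRUE)

# Hypothetical helper: TRUE if f reproduces all six expected results.
check_candidate <- function(f) {
  got <- vapply(pairs, function(p) f(p[[1]], p[[2]]), logical(1))
  identical(got, expected)
}

all_ok <- vapply(candidates, check_candidate, logical(1))
print(all_ok)
```

Any new candidate can be added to the list and checked the same way before it is timed.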
Not really an answer, just some comments:
intersect() uses match() at some point:
intersect <- function (x, y)
{
    y <- as.vector(y)
    unique(y[match(as.vector(x), y, 0L)])
}
%in% is also defined in terms of match(), and as such is very close to EIC.3:
`%in%` <- function (x, table) match(x, table, nomatch = 0L) > 0L
You could shave a bit of time on some cases with this:
EIC.all <- function(vec1, vec2) !all(is.na(match(vec1, vec2)))
because the negation ! is performed on a scalar instead of a vector of size length(vec1).
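Whether that saving is noticeable can be checked with a rough timing sketch like the one below. The data (x, y), the sizes and the repetition count are assumptions chosen for illustration; y is shifted so the two vectors share no values, which forces both versions to scan everything. No particular outcome is claimed here, and timings will vary by machine.

```r
EIC.3   <- function(vec1, vec2) any(!is.na(match(vec1, vec2)))
EIC.all <- function(vec1, vec2) !all(is.na(match(vec1, vec2)))

set.seed(1)
n <- 1e5
x <- sample.int(1e7, n)            # hypothetical test data
y <- sample.int(1e7, n) + 1e7L     # shifted so x and y cannot overlap

reps <- 20
time_3   <- system.time(for (i in seq_len(reps)) EIC.3(x, y))
time_all <- system.time(for (i in seq_len(reps)) EIC.all(x, y))

# Elapsed seconds for each version; compare on your own data.
print(c(EIC.3 = time_3[["elapsed"]], EIC.all = time_all[["elapsed"]]))
```

Both functions spend most of their time inside match(), so any difference comes only from the work done on its result.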
What you need is a C/C++ function that does the exact same thing as the match internal function but stops at the first match.
You could have a look at the match5 C function: https://github.com/wch/r-source/blob/d1f8ef492464fd68320be9581bde4b09eadc03d6/src/main/unique.c#L1332
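Short of writing C, the stop-at-first-match idea can be approximated in plain R by feeding match() one chunk at a time and returning as soon as a chunk contains a hit. This is only a sketch, not the C function described above: EIC.chunked and chunk_size are hypothetical names, and since match() has to prepare its lookup on the second argument for every call, the shorter vector is used as the table. It only pays off when a common element tends to appear early.

```r
# Hypothetical pure-R approximation of an early-exit check.
EIC.chunked <- function(vec1, vec2, chunk_size = 10000L) {
  # Probe with the longer vector, look up in the shorter one.
  if (length(vec1) < length(vec2)) {
    tmp <- vec1
    vec1 <- vec2
    vec2 <- tmp
  }
  n <- length(vec1)
  if (n == 0L || length(vec2) == 0L) return(FALSE)

  for (s in seq.int(1L, n, by = chunk_size)) {
    e <- min(s + chunk_size - 1L, n)
    # Stop as soon as any element of this chunk is found in vec2.
    if (any(match(vec1[s:e], vec2, nomatch = 0L) > 0L)) return(TRUE)
  }
  FALSE
}

v1 <- c(9, 8, 75, 62)
v2 <- c(20, 75, 341, 987, 8)
v3 <- c(154, 62, 62, 143, 154, 95)
v4 <- c(12, 62, 12)

# A tiny chunk_size forces several chunks even on this small test data.
res <- c(EIC.chunked(v1, v2, 2L), EIC.chunked(v1, v3, 2L), EIC.chunked(v1, v4, 2L),
         EIC.chunked(v2, v3, 2L), EIC.chunked(v2, v4, 2L), EIC.chunked(v3, v4, 2L))
print(res)
```

The chunk size is a tuning knob: small chunks exit sooner but call match() more often, so it is worth timing on realistic data before relying on this.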