What is the fastest way to perform multiple logical comparisons in R?

Consider, for example, the vector x:

set.seed(14)
x = sample(LETTERS[1:4], size=10, replace=TRUE)

I want to test whether each entry of x is either "A" or "B" (and nothing else). The following works:

x == "A" | x == "B"
[1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE

The above code makes three passes over the whole vector: one for each == comparison and one for the |. Is there a way in R to make a single pass and test, for each item, whether it satisfies one condition or the other?

Asked by Remi.b, Dec 29 '15 22:12


1 Answer

If your objective is just to make a single pass, that is pretty straightforward to write in Rcpp, even if you don't have much experience with C++:

#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::LogicalVector single_pass(Rcpp::CharacterVector x, Rcpp::String a, Rcpp::String b) {
  R_xlen_t i = 0, n = x.size();
  Rcpp::LogicalVector result(n);

  for ( ; i < n; i++) {
    result[i] = (x[i] == a || x[i] == b);
  }

  return result;
}

For such a small object as the one used in your example, the slight overhead of .Call (presumably) masks the speed of the Rcpp version:

r_fun <- function(X) X == "A" | X == "B"
##
cpp_fun <- function(X) single_pass(X, "A", "B")
##
all.equal(r_fun(x), cpp_fun(x))
#[1] TRUE
microbenchmark::microbenchmark(
  r_fun(x), cpp_fun(x), times = 1000L)
#Unit: microseconds
#expr         min    lq     mean median     uq    max neval
#r_fun(x)   1.499 1.584 1.974156 1.6795 1.8535 37.903  1000
#cpp_fun(x) 1.860 2.334 3.042671 2.7450 3.1140 51.870  1000

But for larger vectors (I'm assuming this is your real intention), it is considerably faster:

x2 <- sample(LETTERS, 10E5, replace = TRUE)
##
all.equal(r_fun(x2), cpp_fun(x2))
# [1] TRUE
microbenchmark::microbenchmark(
  r_fun(x2), cpp_fun(x2), times = 200L)
#Unit: milliseconds
#expr              min        lq      mean    median        uq      max neval
#r_fun(x2)   78.044518 79.344465 83.741901 80.999538 86.368627 149.5106   200
#cpp_fun(x2)  7.104929  7.201296  7.797983  7.605039  8.184628  10.7250   200

Here's a quick attempt at generalizing the above, if you have any use for it.
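
For illustration only, one way such a generalization could look is sketched below in plain, standalone C++. It is not the code referred to above. The function name matches_any and the use of std::vector and std::unordered_set are assumptions made for this sketch, and it does not model R's NA values. The idea is the same single pass, but each element is checked against an arbitrary set of target strings rather than exactly two.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not from the original answer): returns one flag per
// element of x, true when that element equals any string in targets.
// Assumption: plain C++ containers stand in for Rcpp vectors; NA is not handled.
std::vector<bool> matches_any(const std::vector<std::string>& x,
                              const std::vector<std::string>& targets) {
  // Build the lookup set once, so each element of x costs one hash lookup
  // instead of one comparison per target.
  const std::unordered_set<std::string> lookup(targets.begin(), targets.end());

  std::vector<bool> result;
  result.reserve(x.size());

  // Single pass over x.
  for (const std::string& value : x) {
    result.push_back(lookup.count(value) > 0);
  }
  return result;
}
```

Called with the elements "A", "C", "B" and the targets "A", "B", this returns true, false, true. For only two targets, the direct comparison in single_pass avoids the cost of building the set, so the set-based version is mainly of interest when the list of targets is longer.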

Answered by nrussell, Nov 14 '22 23:11