
Vectorized equality testing

Tags: equality, r

I'd be surprised if this isn't a dup, but I couldn't find a solution.

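I understand the limitations of == for testing equality of floating-point numbers. One should use all.equal instead (wrapped in isTRUE() when a plain logical is needed, since it returns a description of the difference rather than FALSE when the values differ):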

0.1 + 0.2 == 0.3
# FALSE
all.equal(0.1 + 0.2, 0.3)
# TRUE
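
A minimal sketch (not from the original post; `x`, `y` and `res` are illustrative names) of why all.equal is usually wrapped in isTRUE() and why it does not help element-wise: on a mismatch it returns a character description, and it gives one answer for the whole vector.

```r
x <- c(0.1 + 0.2, 1)
y <- c(0.3, 2)

# all.equal() compares the two vectors as a whole
res <- all.equal(x, y)

# On a mismatch the result is a character message, not FALSE
print(is.character(res))

# isTRUE() turns it into a single logical, e.g. for use in if()
print(isTRUE(res))

# The first pair on its own is equal within the default tolerance
print(isTRUE(all.equal(x[1], y[1])))
```

So even though the first pair matches, the whole-vector call reports a single mismatch, which is why a per-element test needs something else.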

But == has the advantage of being vectorized:

set.seed(1)
Df <- data.frame(x = sample(seq(-1, 1, by = 0.1), size = 100, replace = TRUE),
                 y = 0.1)
Df[Df$x > 0 & Df$x < 0.2,]
##      x   y
## 44 0.1 0.1
## 45 0.1 0.1

# yet
sum(Df$x == Df$y)
# [1] 0

I can write a (bad) function myself:

All.Equal <- function(x, y){
  stopifnot(length(x) == length(y))
  out <- logical(length(x))
  for (i in seq_along(x)){
    out[i] <- isTRUE(all.equal(x[i], y[i]))
  }
  out
}

sum(All.Equal(Df$x, Df$y))

which gives the correct answer, but is orders of magnitude slower than ==:

microbenchmark::microbenchmark(All.Equal(Df$x, Df$y), Df$x == Df$y)
Unit: microseconds
                  expr      min        lq        mean     median        uq        max neval cld
 All.Equal(Df$x, Df$y) 9954.986 10298.127 20382.24436 10511.5360 10798.841 915182.911   100   b
          Df$x == Df$y   16.857    19.265    29.06261    30.8535    38.529     45.151   100  a 

Another option might be:

All.equal.abs <- function(x,y){
  tol <- .Machine$double.eps ^ 0.5
  abs(x - y) < tol
}

which performs comparably to ==.
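
One caveat with a fixed absolute tolerance is that it does not scale with the magnitude of the values, whereas all.equal switches to a relative difference when the target is not close to zero. A rough vectorized sketch of that idea follows; `near_rel` is a hypothetical name, and it only loosely mirrors all.equal rather than reproducing it.

```r
# Hypothetical helper: element-wise comparison with a relative tolerance.
# Assumption: the difference is scaled by abs(x) when abs(x) exceeds tol,
# and the plain absolute difference is used otherwise (e.g. near zero).
near_rel <- function(x, y, tol = sqrt(.Machine$double.eps)) {
  diff <- abs(x - y)
  scale <- abs(x)
  ifelse(scale > tol, diff / scale, diff) < tol
}

# Small values: behaves like the absolute test
print(near_rel(0.1 + 0.2, 0.3))

# Large values: a difference of 1 is negligible relative to 1e10,
# but it fails a fixed absolute tolerance
print(near_rel(1e10, 1e10 + 1))
print(abs(1e10 - (1e10 + 1)) < sqrt(.Machine$double.eps))
```

Whether the relative or the absolute version is appropriate depends on the scale of the data; for values of order 1, as in Df, the two agree.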

Is there an existing function that performs this task?

Hugh asked Jan 30 '16 03:01




1 Answer

Vectorize() turns out to be a slow option. As @fishtank suggests in the comments, the best solution is to check whether the absolute difference is smaller than some tolerance value, i.e. is_equal_tol() below.

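set.seed(123)
# Note: a and b are integer vectors, so no floating-point error is involved here
a <- sample(1:10, size = 50, replace = TRUE)
b <- sample(a)

# Element-wise test: absolute difference below a tolerance
is_equal_tol <- function(x, y, tol = .Machine$double.eps ^ 0.5) {
  abs(x - y) < tol
}

# all.equal() applied pair by pair; each result is TRUE or a message
is_equal_vec <- Vectorize(all.equal, c("target", "current"))

is_equal_eq <- function(x, y) x == y

# Note: isTRUE() on a length-50 result is a single FALSE, not an
# element-wise answer; the call is kept as benchmarked
microbenchmark::microbenchmark(is_equal_eq(a, b),
                               is_equal_tol(a, b),
                               isTRUE(is_equal_vec(a, b)),
                               times = 1000L)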

Unit: nanoseconds
                       expr     min      lq        mean  median      uq      max neval
          is_equal_eq(a, b)       0     856    1545.797    1284    2139    14113  1000
         is_equal_tol(a, b)    1711    2567    4991.377    4278    6843    27370  1000
 isTRUE(is_equal_vec(a, b)) 2858445 3008552 3258916.503 3082964 3204204 46130260  1000
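
The benchmark above uses integer data and wraps the whole vectorized result in a single isTRUE(), so it measures speed but does not check that the approaches agree element-wise. The following is a small sketch of such a check on floating-point data like that in the question; `is_equal_each` is a hypothetical name, and the `dplyr::near()` comparison is optional and only runs if dplyr happens to be installed.

```r
is_equal_tol <- function(x, y, tol = .Machine$double.eps ^ 0.5) {
  abs(x - y) < tol
}

# Hypothetical reference implementation: all.equal() pair by pair,
# with isTRUE() applied to each individual result
is_equal_each <- function(x, y) {
  stopifnot(length(x) == length(y))
  unname(mapply(function(a, b) isTRUE(all.equal(a, b)), x, y))
}

x <- seq(-1, 1, by = 0.1)   # contains a value that is only approximately 0.1
y <- rep(0.1, length(x))

print(sum(x == y))                                        # exact comparison
print(sum(is_equal_tol(x, y)))                            # tolerance comparison
print(identical(is_equal_tol(x, y), is_equal_each(x, y)))

# dplyr::near() performs the same kind of tolerance test, if dplyr is available
if (requireNamespace("dplyr", quietly = TRUE)) {
  print(identical(dplyr::near(x, y), is_equal_tol(x, y)))
}
```

On this data the tolerance test and the pair-by-pair all.equal agree, which supports using the cheap vectorized version.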
Johan Larsson answered Nov 12 '22 06:11