
Do you reassign == and != to isTRUE( all.equal() )?

Tags:

r

A previous post prompted me to post this question. It would seem like a best practice to reassign == to isTRUE(all.equal()) (and != to !isTRUE(all.equal())). I'm wondering if others do this in practice.

I just realized that I use == and != to do numeric equality throughout my codebase. My first reaction was that I need to do a full scrub and convert to all.equal. But in fact, every time I use == and != I want to test equality (regardless of the datatype). In fact, I'm not sure what these operations would test for other than equality. I'm sure I'm missing some concept here. Can someone enlighten me?

The only argument I see against this approach is that in some cases two non-identical numbers will appear to be identical because of the tolerance of all.equal. But we're told that two numbers that are in fact identical might not pass identical() because of how they are stored in memory. So really, what's the point of not defaulting to all.equal?

SFun28 asked Oct 05 '11 15:10



1 Answer

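As @joran alluded to, you'll run into floating-point issues with == and != in pretty much any other language too. One important property of these operators in R is that they are vectorized: they compare element by element and return a logical vector, whereas isTRUE(all.equal()) always collapses the comparison to a single TRUE or FALSE.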
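To make the two points above concrete, here is a small sketch. The variable names are only illustrative; the values are the usual 0.1 + 0.2 example plus two short vectors.

```r
# 0.1 + 0.2 is not stored as exactly 0.3 in binary floating point
a <- 0.1 + 0.2
b <- 0.3
exact <- a == b                       # FALSE
fuzzy <- isTRUE(all.equal(a, b))      # TRUE

# == compares element by element and returns one result per element
x <- c(1, 2, 3)
y <- c(1, 2, 4)
elementwise <- x == y                 # TRUE TRUE FALSE

# isTRUE(all.equal()) collapses the whole comparison to a single value
collapsed <- isTRUE(all.equal(x, y))  # FALSE
```

Reassigning == to isTRUE(all.equal()) would therefore break any code that relies on the element-wise result, such as x[x == 2].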

It would be much better to define a new function: almostEqual, fuzzyEqual or similar. It is unfortunate that there is no such base function. all.equal isn't very efficient, since it handles all kinds of objects and, when they differ, returns a string describing the difference, when mostly you just want TRUE or FALSE.
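A short illustration of that return value may help. It uses only base R, and the variable names are made up for the example.

```r
# Within the default tolerance: all.equal returns TRUE
close_enough <- all.equal(1, 1 + 1e-10)

# Outside the tolerance: it returns a character description, not FALSE
too_far <- all.equal(1, 1.1)
print(too_far)

# Using the result directly in if() fails when the values differ,
# because a description string cannot be interpreted as a logical
outcome <- tryCatch(
  if (all.equal(1, 1.1)) "equal" else "different",
  error = function(e) "error"
)

# Wrapping in isTRUE() gives a plain TRUE/FALSE
safe <- isTRUE(all.equal(1, 1.1))
```

This is why the question wraps every call in isTRUE().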

Here's an example of such a function. It's vectorized like ==.

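almostEqual <- function(x, y, tolerance=1e-8) {
  # Absolute difference and the larger magnitude of each pair
  diff <- abs(x - y)
  mag <- pmax( abs(x), abs(y) )
  # Relative test for values above the tolerance, absolute test near zero
  ifelse( mag > tolerance, diff/mag <= tolerance, diff <= tolerance)
}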

almostEqual(1, c(1+1e-8, 1+2e-8)) # [1]  TRUE FALSE

It is around 2x faster than all.equal for scalar values, and much faster with vectors.

x <- 1
y <- 1+1e-8
system.time(for(i in 1:1e4) almostEqual(x, y)) # 0.44 seconds
system.time(for(i in 1:1e4) all.equal(x, y))   # 0.93 seconds
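The vector case can be checked with a rough sketch like the one below. The data (random values, half of them shifted by 1e-3) and the mapply wrapper are assumptions made for illustration, and timings will vary by machine, so none are quoted here.

```r
# Copy of the answer's function so this sketch runs on its own
almostEqual <- function(x, y, tolerance = 1e-8) {
  diff <- abs(x - y)
  mag <- pmax(abs(x), abs(y))
  ifelse(mag > tolerance, diff / mag <= tolerance, diff <= tolerance)
}

set.seed(1)                                  # arbitrary seed, for repeatability
n  <- 1e4
xs <- runif(n, 1, 2)
ys <- xs + rep(c(0, 1e-3), length.out = n)   # every second element differs

# One vectorized call over all n pairs
t_vec <- system.time(res_vec <- almostEqual(xs, ys))

# all.equal has to be called once per pair to get element-wise answers
t_loop <- system.time(
  res_loop <- mapply(function(a, b) isTRUE(all.equal(a, b)), xs, ys)
)

# Both approaches agree on this data; compare t_vec and t_loop for the cost
identical(res_vec, unname(res_loop))
```

The differences here are far larger than either tolerance, so the two approaches agree. Near the tolerance boundary they can disagree, because all.equal uses a different default tolerance and a different definition of relative difference.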
Tommy answered Nov 09 '22 23:11
