
Numeric comparison difficulty in R

I'm trying to compare two numbers in R as part of an if-statement condition:

(a-b) >= 0.5

In this particular instance, a = 0.58 and b = 0.08... and yet (a-b) >= 0.5 is false. I'm aware of the dangers of using == for exact number comparisons, and this seems related:

(a - b) == 0.5 is false, while

all.equal((a - b), 0.5) is true.

The only solution I can think of is to have two conditions: (a-b) > 0.5 | all.equal((a-b), 0.5). This works, but is that really the only solution? Should I just swear off of the = family of comparison operators forever?

Edit for clarity: I know that this is a floating point problem. More fundamentally, what I'm asking is: what should I do about it? What's a sensible way to deal with greater-than-or-equal-to comparisons in R, since the >= can't really be trusted?
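
To see the mismatch concretely, here is a minimal sketch that prints the difference at full precision. It uses only base R and the values from the question; the number of digits printed is an arbitrary choice.

```r
a <- 0.58
b <- 0.08

# Print more digits than R shows by default to expose the rounding error
print(a - b, digits = 17)
sprintf("%.17f", a - b)

# The stored difference sits just below 0.5, so both exact comparisons fail
(a - b) >= 0.5
(a - b) == 0.5

# all.equal() compares within a tolerance and therefore reports equality
isTRUE(all.equal(a - b, 0.5))
```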

Matt Parker asked May 04 '10 22:05


4 Answers

I've never been a fan of all.equal for such things. It seems to me the tolerance works in mysterious ways sometimes. Why not just check for something greater than 0.5 minus a small tolerance?

tol = 1e-5

(a-b) >= (0.5-tol)

In general, I find plain comparison logic with an explicit tolerance, and no rounding, clearer than all.equal.

If x == y then x - y == 0. In floating point, x - y may not be exactly 0, so for such cases I use

abs(x-y) <= tol

all.equal applies a tolerance anyway, and this is more compact and straightforward.
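
As a sketch of how this might look inside the if-statement from the question, the two checks can be wrapped in small helpers. The names near_equal and ge_tol and the default tolerance of 1e-5 are illustrative assumptions, not part of base R.

```r
# Hypothetical helpers; pick tol to suit the scale of your data
near_equal <- function(x, y, tol = 1e-5) abs(x - y) <= tol
ge_tol     <- function(x, y, tol = 1e-5) x >= (y - tol)

a <- 0.58
b <- 0.08

if (ge_tol(a - b, 0.5)) {
  result <- "at least 0.5"
} else {
  result <- "below 0.5"
}
result

# Both helpers are vectorised because they use only elementwise operators
ge_tol(c(0.4, 0.58 - 0.08, 0.6), 0.5)
```

One consequence of this design is that values up to tol below the threshold also pass, so tol should be well below the smallest difference that actually matters in your data.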

John answered Nov 18 '22 03:11


You could create this as a separate operator or overwrite the original >= function (probably not a good idea) if you want to use this approach frequently:

# using a tolerance
epsilon <- 1e-10 # set this as a global setting
`%>=%` <- function(x, y) (x + epsilon > y)

# as a new operator with the original approach
# (isTRUE() is needed: all.equal() returns a character description, not FALSE, on mismatch)
`%>=%` <- function(x, y) (isTRUE(all.equal(x, y)) | (x > y))

# overwriting R's version (not advised)
`>=` <- function(x, y) (isTRUE(all.equal(x, y)) | (x > y))

> (a-b) >= 0.5
[1] TRUE
> c(1,3,5) >= 2:4
[1] FALSE FALSE  TRUE
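
One caveat worth checking: all.equal compares its arguments as whole vectors, so the all.equal-based versions above do not test equality element by element. The sketch below contrasts that with an elementwise tolerance check; the operator name %ge% and the tolerance 1e-10 are assumptions chosen for illustration.

```r
x <- c(0.58 - 0.08, 1, 3)
y <- c(0.5, 2, 3)

# Whole-vector comparison: a single mismatch makes all.equal() report a difference
whole_vector <- isTRUE(all.equal(x, y))
# so the all.equal-based operator falls back to plain x > y for every element
all_equal_ge <- whole_vector | (x > y)

# Hypothetical elementwise alternative with an absolute tolerance
`%ge%` <- function(x, y, tol = 1e-10) (x > y) | (abs(x - y) <= tol)
elementwise_ge <- x %ge% y

all_equal_ge
elementwise_ge
```

With these inputs the two results differ in the first and third positions, where the values are equal up to rounding error.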

Shane answered Nov 18 '22 03:11


For completeness' sake, I'll point out that, in certain situations, you could simply round to a few decimal places (though this is a cruder fix than the tolerance-based solutions already posted).

round(0.58 - 0.08, 2) == 0.5
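
Applied to the comparison from the question, the rounding approach might look like the sketch below. The choice of 2 decimal places is an assumption that matches the precision of the inputs (0.58 and 0.08); data recorded to a different precision would need a different digits value.

```r
a <- 0.58
b <- 0.08

# Round the difference to the precision of the inputs before comparing
rounded_ge <- round(a - b, digits = 2) >= 0.5
rounded_ge
```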

icio answered Nov 18 '22 02:11


One more comment: all.equal is a generic. For numeric values, it dispatches to all.equal.numeric. Inspecting that function shows that its default tolerance is .Machine$double.eps^0.5, where .Machine$double.eps is defined as

double.eps: the smallest positive floating-point number ‘x’ such that
          ‘1 + x != 1’.  It equals ‘double.base ^ ulp.digits’ if either
          ‘double.base’ is 2 or ‘double.rounding’ is 0; otherwise, it
          is ‘(double.base ^ double.ulp.digits) / 2’.  Normally
          ‘2.220446e-16’.

(.Machine manual page).

In other words, that would be an acceptable choice for your tolerance:

myeq <- function(a, b, tol=.Machine$double.eps^0.5)
      abs(a - b) <= tol
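
A brief usage sketch for the function above, repeated here so the block runs on its own. It also illustrates one difference worth knowing: for numeric input all.equal compares relative differences by default, whereas myeq uses an absolute difference, so the two can disagree for large magnitudes. The test values 1e10 and 1e10 + 1 are chosen purely for illustration.

```r
myeq <- function(a, b, tol = .Machine$double.eps^0.5)
  abs(a - b) <= tol

.Machine$double.eps^0.5        # the default tolerance, roughly 1.5e-8

myeq(0.58 - 0.08, 0.5)         # the case from the question

# Large magnitudes: absolute and relative tolerances disagree
big_abs <- myeq(1e10, 1e10 + 1)               # absolute difference of 1 exceeds tol
big_rel <- isTRUE(all.equal(1e10, 1e10 + 1))  # relative difference of 1e-10 is within tol
big_abs
big_rel
```

Which behaviour is preferable depends on the scale of the values being compared; an absolute tolerance is the simpler choice when the data are all of similar, moderate magnitude.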

January answered Nov 18 '22 02:11