I have a problem where a sum of variables representing probabilities in the [0,1] interval is 0 but should be >0. The problem is surely related to floating point representation and precision in R, but I cannot pin down where it goes wrong.
options(digits = 22)
p1 = 0.8
p2 = 0.9999999999999998
p11 = p1 * p2
p10 = p1 - p11
p01 = p2 - p11
p00 = 1 - p11 - p10 - p01
p11, p10, and p01 are all numeric, and p00 is also numeric, but

> p00
[1] 0

and p00 == 0 is TRUE on my machine. However, it should not be zero: mathematically p00 = (1 - p1)(1 - p2) = 0.2 * 2e-16 = 4e-17 > 0.
This problem seems to be related to the fact that p01 is small. However, p01 > 0 is still TRUE on my machine. Why does it go wrong when taking the final sum in p00?

Is there a numerical trick to solve this problem, i.e. to get an exact representation of p00? Note again that the p's are probabilities in [0,1]. I considered using log and exp transformations, but without consistent success.
R itself can only deal with 64-bit floating point numbers, but the Rmpfr package can handle arbitrary-precision floating point numbers:
library(Rmpfr)
> p1 = mpfr("0.8", 128)
> p2 = mpfr("0.9999999999999998", 128)
> p11 = p1 * p2
> p10 = p1 - p11
> p01 = p2 - p11
> p00 = 1 - p11 - p10 - p01
> p00
1 'mpfr' number of precision 128 bits
[1] 4.00000000000000000000000461461738779728e-17
Edit: use strings, not doubles, to define the mpfr numbers, so that they are not rounded to 64-bit precision before the conversion.
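A minimal sketch of why the strings matter (exact printed output may vary by Rmpfr version, so the comments only describe the values):

library(Rmpfr)
# Passing a double literal: R first rounds 0.9999999999999998 to the nearest
# 64-bit float, and mpfr() faithfully extends that already-rounded value.
mpfr(0.9999999999999998, 128)    # ends in ...99977795..., not ...9998
# Passing a string: MPFR parses the decimal digits directly at 128 bits.
mpfr("0.9999999999999998", 128)  # agrees with the decimal to ~38 digits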
You want to avoid catastrophic cancellation, which occurs when subtracting values of approximately the same size: essentially this amplifies any existing errors.
One trick is to do the subtraction first, i.e.
> p1 = 0.8
> p2 = 0.9999999999999998
> (1-p1)*(1-p2)
[1] 4.440892e-17
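This works because, algebraically, p00 reduces to a product of complements:

p00 = 1 - p11 - p10 - p01
    = 1 - p1*p2 - (p1 - p1*p2) - (p2 - p1*p2)
    = 1 - p1 - p2 + p1*p2
    = (1 - p1)*(1 - p2)

so the subtractions 1 - p1 and 1 - p2 act directly on the inputs, before any rounding error has accumulated.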
The remaining inaccuracy is due to the fact that 0.9999999999999998
isn't actually 0.9999999999999998, but instead is stored as 0.9999999999999997779553950749686919152736663818359375 (which is the closest representable floating point number)
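You can verify the stored value from within R by printing the double with 52 decimal digits:

> sprintf("%.52f", 0.9999999999999998)
[1] "0.9999999999999997779553950749686919152736663818359375"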
Update: to explain a bit further what is going on.
Since most floating point operations cannot be done exactly, they incur some error: this is the difference between the result you get and the true mathematical result. It is inherently relative, rather than absolute (i.e. significant digits, not digits after the decimal place).
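For example, an absolute error of 1 can still be a tiny relative error at a large enough scale:

> 1e16 + 1 == 1e16   # the + 1 is lost entirely
[1] TRUE

The absolute error is 1, but the relative error is only about 1e-16: the result is still correct to 16 significant digits.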
When thinking about error, you need to keep track of 2 things: the error introduced by each operation itself, and how existing error is propagated by subsequent operations.

The first is not too difficult: most operations incur a small amount of relative error that is at most about 1e-16 (half a unit in the last place). There are also a couple of operations that incur no error at all: subtracting two numbers that are within 2x of each other, and multiplying or dividing by powers of 2.

The second is trickier, but basically you want to avoid anything that would amplify existing error. The most insidious case is subtracting two numbers that are roughly equal: the absolute error stays the same, but the relative error can increase dramatically. Multiplication and division are inherently relative operations, so they have no amplifying effect here.
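A quick illustration of both points in R:

> x <- 0.1
> x * 2 == 0.2       # multiplying by 2 is exact
[1] TRUE
> 0.1 + 0.2 == 0.3   # an ordinary addition rounds
[1] FALSE
> (1 + 1e-15) - 1    # true answer is 1e-15; the ~1e-16 rounding error
[1] 1.110223e-15     # in (1 + 1e-15) becomes an ~11% relative error

The multiplication by 2 only changes the exponent, so no information is lost; the cancellation in the last line leaves the rounding error of the addition as a large fraction of the small result.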
In your original code, you are doing a lot of such subtractions on quantities that have already accumulated some error. In my modification, I do the subtractions first, so this error is minimised (though in the case of p2, you can see that the result is still drastically influenced by the error in the conversion from the decimal string "0.9999999999999998" to the binary floating point number).
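To see how much that conversion error alone contributes: from the decimal string, 1 - p2 should be exactly 2e-16, but

> 1 - 0.9999999999999998
[1] 2.220446e-16

This subtraction is itself exact (the two numbers are within 2x of each other), so the entire discrepancy from 2e-16 is the decimal-to-binary conversion error, which then carries straight through to (1-p1)*(1-p2).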