I am having problems with very small numbers in NumPy. It took me several weeks to trace my persistent problems with numerical integration back to the fact that float64 precision is lost when I add up floats in a function. Performing the mathematically identical calculation with a product instead of a sum gives values that are fine.
Here is a code sample and a plot of the results:
from matplotlib.pyplot import *
from numpy import vectorize, arange
import math

def func_product(x):
    return math.exp(-x) / (1 + math.exp(x))

def func_sum(x):
    return math.exp(-x) - 1 / (1 + math.exp(x))
# mathematically, both functions are the same

vecfunc_sum = vectorize(func_sum)
vecfunc_product = vectorize(func_product)

x = arange(0., 300., 1.)
y_sum = vecfunc_sum(x)
y_product = vecfunc_product(x)

plot(x, y_sum, 'k.-', label='sum')
plot(x, y_product, 'r--', label='product')
yscale('symlog', linthresh=1E-256)  # keyword was 'linthreshy' in matplotlib < 3.3
legend(loc='lower right')
show()
As you can see, the summed values at very small magnitudes are scattered around zero or are exactly zero, while the multiplied values are fine...
Please, could someone help/explain? Thanks a lot!
Floating point precision is pretty sensitive to addition and subtraction because of roundoff error. Eventually, 1 + exp(x) gets so big that adding 1 to exp(x) gives the same result as exp(x) alone. In double precision, that happens somewhere around exp(x) == 1e16:
>>> (1e16 + 1) == (1e16)
True
>>> (1e15 + 1) == (1e15)
False
Note that math.log(1e16) is approximately 37, which is roughly where things go crazy on your plot.
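You can confirm both of those numbers quickly; this is a small illustrative check using only standard math and NumPy calls:

import math
import numpy as np

# Machine epsilon for float64: increments below ~2.2e-16 of 1.0 are rounded away.
print(np.finfo(np.float64).eps)  # ~2.220446049250313e-16
# The x at which exp(x) reaches 1e16, i.e. where 1 + exp(x) == exp(x):
print(math.log(1e16))            # ~36.84, matching the breakdown near x = 37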
You can have the same problem, but on different scales:
>>> (1e-16 + 1.) == (1.)
True
>>> (1e-15 + 1.) == (1.)
False
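The gap between adjacent representable doubles grows with magnitude, which is what both comparisons above are probing. As a minimal sketch, np.spacing reports that gap directly:

import numpy as np

# Distance from a value to the next representable float64:
print(np.spacing(1.0))   # ~2.22e-16 -> adding 1e-16 to 1.0 is rounded away
print(np.spacing(1e16))  # 2.0       -> adding 1 to 1e16 is rounded away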
For the vast majority of points in your regime, func_product is actually calculating
exp(-x)/exp(x) == exp(-2*x)
which is why your graph has a nice slope of -2.
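As a quick illustrative check of that asymptotic form:

import numpy as np

x = 50.0
# For large x, 1 + exp(x) is indistinguishable from exp(x) in float64:
print(np.exp(-x) / (1 + np.exp(x)))  # ~3.72e-44
print(np.exp(-2 * x))                # ~3.72e-44 -- the same value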
Taking it to the other extreme, your other version is calculating (at least approximately)
exp(-x) - 1./exp(x)
which is approximately
exp(-x) - exp(-x)
This is an example of catastrophic cancellation.
Let's look at the first point where the calculation goes awry, at x = 36.0:
In [42]: np.exp(-x)
Out[42]: 2.3195228302435691e-16
In [43]: - 1/(1+np.exp(x))
Out[43]: -2.3195228302435691e-16
In [44]: np.exp(-x) - 1/(1+np.exp(x))
Out[44]: 0.0
The calculation using func_product does not subtract nearly equal numbers, so it avoids the catastrophic cancellation.
By the way, if you change math.exp to np.exp, you can get rid of np.vectorize (which is slow):

import numpy as np

def func_product(x):
    return np.exp(-x) / (1 + np.exp(x))

def func_sum(x):
    return np.exp(-x) - 1 / (1 + np.exp(x))

y_sum = func_sum(x)
y_product = func_product(x)
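A rough timing comparison makes the "slow" claim concrete. This sketch reuses vecfunc_sum from the question and the NumPy func_sum from above; exact numbers vary by machine:

import timeit
import numpy as np

x = np.arange(0., 300., 1.)
# np.vectorize loops in Python, while the plain NumPy version is one C-level pass.
print(timeit.timeit(lambda: vecfunc_sum(x), number=1000))
print(timeit.timeit(lambda: func_sum(x), number=1000))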
The problem is that your func_sum is numerically unstable, because it involves a subtraction between two very close values.
In the calculation of func_sum(200), for example, math.exp(-200) and 1/(1+math.exp(200)) have the same value, because adding 1 to math.exp(200) has no effect, since it is outside the precision of 64-bit floating point:
>>> math.exp(200).hex()
'0x1.73f60ea79f5b9p+288'
>>> (math.exp(200) + 1).hex()
'0x1.73f60ea79f5b9p+288'
>>> (1/(math.exp(200) + 1)).hex()
'0x1.6061812054cfap-289'
>>> math.exp(-200).hex()
'0x1.6061812054cfap-289'
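The same fact expressed as direct comparisons (a quick check):

import math

# The added 1 is absorbed entirely by rounding at this magnitude...
print(math.exp(200) + 1 == math.exp(200))         # True
# ...so the two terms of func_sum(200) are bit-for-bit identical:
print(math.exp(-200) == 1 / (1 + math.exp(200)))  # True -> their difference is exactly 0.0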
This explains why func_sum(200) gives zero, but what about the points that lie off the x axis? These are also caused by floating point imprecision; it occasionally happens that math.exp(-x) is not equal to 1/math.exp(x). Ideally, math.exp(x) is the closest floating-point value to e^x, and 1/math.exp(x) is then the closest floating-point value to the reciprocal of the floating-point number calculated by math.exp(x), not necessarily to e^-x itself. Indeed, math.exp(-100) and 1/(1+math.exp(100)) are very close and in fact differ only in the last unit:
>>> math.exp(-100).hex()
'0x1.a8c1f14e2af5dp-145'
>>> (1/math.exp(100)).hex()
'0x1.a8c1f14e2af5cp-145'
>>> (1/(1+math.exp(100))).hex()
'0x1.a8c1f14e2af5cp-145'
>>> func_sum(100).hex()
'0x1.0000000000000p-197'
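That residue is exactly one unit in the last place (ulp) of math.exp(-100), and math.ulp (available since Python 3.9) confirms it:

import math

def func_sum(x):
    return math.exp(-x) - 1 / (1 + math.exp(x))

# One ulp at the scale of exp(-100) is 2**(-145 - 52) == 2**-197:
print(math.ulp(math.exp(-100)) == 2.0 ** -197)  # True
print(func_sum(100) == 2.0 ** -197)             # True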
So what you have actually calculated is the difference, if any, between math.exp(-x) and 1/math.exp(x). You can trace the line of the function math.pow(2, -52) * math.exp(-x) to see that it passes through the positive values of func_sum (recall that the significand of a 64-bit float has 52 fraction bits, so one ulp of exp(-x) is 2**-52 * exp(-x)).
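Overlaying that line on the question's plot shows the nonzero residues sitting on it. This sketch reuses x and y_sum from the question's script:

from matplotlib.pyplot import plot, yscale, legend, show
import numpy as np

plot(x, y_sum, 'k.-', label='sum')
# One-ulp envelope: the positive residues of func_sum lie on this line.
plot(x, 2.0 ** -52 * np.exp(-x), 'b:', label='2**-52 * exp(-x)')
yscale('symlog', linthresh=1E-256)
legend(loc='lower right')
show()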