I noticed some NaNs appearing unexpectedly in my data (and then spreading, turning everything they touched into NaN). After some careful investigation, I produced a minimal working example:
>>> import numpy as np
>>> from scipy.special import expit
>>> expit(709)
1.0
>>> expit(710)
nan
expit is the inverse logit. The SciPy documentation defines it as:
expit(x) = 1/(1+exp(-x))
So 1+exp(-709)==1.0, so that expit(709)=1.0. Seems fairly reasonable, rounding exp(-709)==0.
However, what is going on with expit(710)? expit(710)==nan implies that 1+exp(-710)==0, which implies exp(-710)=-1, which is not right at all.
What is going on?
I am fixing it with:
def sane_expit(x):
    x = np.minimum(x, 700)  # Cap it at 700 to avoid overflow
    return expit(x)
But this is going to be a bit slower, because of the extra operation and the Python overhead.
I am using numpy 1.8.0 and scipy 0.13.2.
The function is evidently not coded to deal with such large inputs, and encounters an overflow during the internal calculations.
The significance of the number 710 is that math.exp(709) can be represented as a float, whereas math.exp(710) cannot:
In [27]: import math
In [28]: math.exp(709)
Out[28]: 8.218407461554972e+307
In [29]: math.exp(710)
---------------------------------------------------------------------------
OverflowError Traceback (most recent call last)
----> 1 math.exp(710)
OverflowError: math range error
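To see how an overflow like this can end up as nan rather than inf, here is a small sketch. It assumes the internal calculation uses the algebraically equivalent form exp(x)/(1+exp(x)); that is only a guess about the internals, not something confirmed from the SciPy source, and naive_expit is a made-up name for illustration.

```python
import math
import sys

import numpy as np

# Largest argument for which exp(x) still fits in a float64.
# It lies between 709 and 710, which is why 710 is the tipping point.
max_exp_arg = math.log(sys.float_info.max)


def naive_expit(x):
    """Hypothetical naive formula, NOT SciPy's actual code."""
    # Silence the overflow/invalid warnings so only the values are shown.
    with np.errstate(over="ignore", invalid="ignore"):
        e = np.exp(np.float64(x))  # becomes inf once x exceeds max_exp_arg
        return e / (1.0 + e)       # inf / inf evaluates to nan


print(max_exp_arg)
print(naive_expit(709))
print(naive_expit(710))
```

Under that assumption the nan comes from dividing inf by inf, not from exp(-710) being -1.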
Might be worth filing a bug against SciPy.
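In the meantime, instead of capping the input at 700, one option is to write the logistic function so that exp is only ever called on non-positive values. The following is a minimal sketch of that idea, not SciPy's implementation; stable_expit is a hypothetical helper name.

```python
import numpy as np


def stable_expit(x):
    """Sketch of an overflow-safe logistic function (hypothetical helper)."""
    x = np.asarray(x, dtype=np.float64)
    # exp(-|x|) always lies in [0, 1], so it can never overflow.
    z = np.exp(-np.abs(x))
    # For x >= 0 use 1 / (1 + exp(-x)); for x < 0 use exp(x) / (1 + exp(x)).
    return np.where(x >= 0, 1.0 / (1.0 + z), z / (1.0 + z))


print(stable_expit([-710.0, 0.0, 709.0, 710.0]))
```

Both branches are the same function mathematically; picking the branch by the sign of x just keeps every intermediate value finite, so no explicit cap is needed.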