What are the most common operations that produce a NaN in Python when working with NumPy or SciPy?
For example:
>>> 1e500 - 1e500
nan
What is the reasoning for this behavior and why does it not return 0?
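The example in the question can be unpacked with plain Python floats. This is a small sketch, assuming standard IEEE 754 doubles (which CPython uses on common platforms): the literal 1e500 is too large for a double, so it is already inf before the subtraction happens, and inf - inf has no meaningful value.

```python
import math

big = 1e500                 # larger than the biggest finite double (~1.8e308)
print(big)                  # the literal is stored as inf

diff = big - big            # this is really inf - inf, an invalid operation
print(diff)
print(math.isnan(diff))     # NaN never compares equal to anything, so test with isnan

# With values that fit in a double, the subtraction is exact.
small_diff = 1e300 - 1e300
print(small_diff)
```

So the result is not 0 because the information needed to compute 0 was lost when 1e500 overflowed to inf.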
In NumPy, nan is a floating-point value only. It stands for "not a number" and represents an undefined numeric result. NumPy exposes it as a constant, np.nan.
No, you can't, at least with the current version of NumPy. NaN is a special value for float arrays only.
NaN stands for Not A Number and is a common missing data representation. It is a special floating-point value and cannot be converted to any other type than float.
If you do any of the following without horsing around with the floating-point environment, you should get a NaN where you didn't have one before:
0/0 (either sign on top and bottom)
inf/inf (either sign on top and bottom)
inf - inf or (-inf) + inf or inf + (-inf) or (-inf) - (-inf)
0 * inf and inf * 0 (either sign on both factors)
sqrt(x) when x < 0
fmod(x, y) when y = 0 or x is infinite; here fmod is floating-point remainder.
The canonical reference for these aspects of machine arithmetic is the IEEE 754 specification. Section 7.1 describes the invalid operation exception, which is the one that is raised when you're about to get a NaN. "Exception" in IEEE 754 means something different than it does in a programming language context.
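The operations listed above can be checked from NumPy. This is an illustrative sketch, not an exhaustive test: it uses np.float64 scalars (plain Python floats would raise ZeroDivisionError for 0/0 instead of returning NaN) and silences the RuntimeWarnings NumPy normally emits. The results dictionary is just a name chosen for this example.

```python
import numpy as np

inf = np.inf
zero = np.float64(0.0)
one = np.float64(1.0)

# NumPy warns on invalid operations by default; suppress that for the demo.
with np.errstate(invalid="ignore", divide="ignore"):
    results = {
        "0/0": zero / zero,
        "inf/inf": inf / inf,
        "inf - inf": inf - inf,
        "0 * inf": zero * inf,
        "sqrt(-1)": np.sqrt(np.float64(-1.0)),
        "fmod(1, 0)": np.fmod(one, zero),
        "fmod(inf, 1)": np.fmod(inf, one),
    }

for name, value in results.items():
    print(f"{name:>12} -> {value}  (isnan: {np.isnan(value)})")
```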
Lots of special function implementations document their behaviour at singularities of the function they're trying to implement. See the man pages for atan2 and log, for instance.
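As a quick illustration of why the documentation matters, here is a sketch using NumPy's counterparts of those two functions, np.log and np.arctan2, which follow the C library conventions. Not every singularity yields a NaN: some are defined to return an infinity or an ordinary finite value.

```python
import numpy as np

with np.errstate(divide="ignore", invalid="ignore"):
    log_zero = np.log(0.0)     # pole at 0: returns -inf, not NaN
    log_neg = np.log(-1.0)     # outside the real domain: returns NaN

# atan2 is defined even where y/x would be 0/0 or inf/inf.
atan2_zero = np.arctan2(0.0, 0.0)
atan2_inf = np.arctan2(np.inf, np.inf)

print(log_zero, log_neg, atan2_zero, atan2_inf)
```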
You're asking specifically about NumPy and SciPy. I'm not sure whether this is simply to say "I'm asking about the machine arithmetic that happens under the hood in NumPy" or "I'm asking about eig() and stuff." I'm assuming the former, but the rest of this answer tries to make a vague connection to the higher-level functions in NumPy. The basic rule is: if the implementation of a function commits one of the above sins, you get a NaN.
For fft, for instance, you're liable to get NaNs if your input values are around 1e1010 or larger and a silent loss of precision if your input values are around 1e-1010 or smaller. Apart from truly ridiculously scaled inputs, though, you're quite safe with fft.
For things involving matrix math, NaNs can crop up (usually through the inf - inf route) if your numbers are huge or your matrix is extremely ill-conditioned. A complete discussion of how you can get screwed by numerical linear algebra is too long to belong in an answer. I'd suggest going over a numerical linear algebra book (Trefethen and Bau is popular) over the course of a few months instead.
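The inf - inf route can be seen in a tiny dot product written out by hand. This is a contrived sketch with made-up values, chosen so that each product overflows; the mathematically exact answer is 0. The multiply and the sum are done as separate steps so the example does not depend on how a particular BLAS library orders or fuses the operations.

```python
import numpy as np

row = np.array([1e200, -1e200])
col = np.array([1e200, 1e200])

with np.errstate(over="ignore", invalid="ignore"):
    products = row * col       # each product is about +/-1e400, which overflows
    total = products.sum()     # inf + (-inf) is an invalid operation

print(products)
print(total)

# Rescaling the inputs first keeps every intermediate value finite.
scaled_total = ((row / 1e200) * (col / 1e200)).sum()
print(scaled_total)
```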
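One thing I've found useful when writing and debugging code that "shouldn't" generate NaNs is to tell the machine to trap if a NaN occurs. In GNU C, I do this:
#define _GNU_SOURCE  /* feenableexcept is a GNU extension */
#include <fenv.h>
feenableexcept(FE_INVALID);  /* call once at startup, e.g. at the top of main() */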
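A rough NumPy analogue of this trap is np.errstate(invalid="raise"), which turns the invalid-operation condition into a Python FloatingPointError for NumPy operations inside the with block. The sketch below assumes that is the behaviour you want; mean_ratio is a hypothetical helper invented for the example, and note that this affects NumPy's own arithmetic, not arbitrary compiled code.

```python
import numpy as np


def mean_ratio(num, den):
    """Hypothetical helper: elementwise ratio, then the mean."""
    num = np.asarray(num, dtype=float)
    den = np.asarray(den, dtype=float)
    return np.mean(num / den)


trapped = None
try:
    # Inside this block, an invalid operation raises instead of returning NaN.
    with np.errstate(invalid="raise"):
        mean_ratio([1.0, 0.0], [2.0, 0.0])   # second element is 0/0
except FloatingPointError as exc:
    trapped = str(exc)

print("trapped:", trapped)
```

The traceback from the exception points at the first operation that produced a NaN, which is usually far more useful than discovering the NaN in the final result.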