I'm battling some floating-point problems in pandas' read_csv function. In my investigation, I found this:
In [15]: a = 5.9975
In [16]: a
Out[16]: 5.9975
In [17]: np.float64(a)
Out[17]: 5.9974999999999996
Why do Python's built-in float and NumPy's np.float64 type give different results? I thought they were both C doubles?
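np.float32 and np.float64 are NumPy-specific 32- and 64-bit float types, whereas np.float was simply an alias for Python's built-in float (the alias was deprecated in NumPy 1.20 and removed in 1.24). Thus, isinstance(2.0, np.float) was equivalent to isinstance(2.0, float): 2.0 is a plain Python built-in float, not a NumPy type.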
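A small sketch of how the type checks play out, assuming a reasonably recent NumPy is installed. The variable names are only for illustration. It avoids the old np.float alias and checks against np.float64 and np.float32 directly:

```python
import numpy as np

x = 2.0               # plain Python built-in float
y = np.float64(2.0)   # NumPy 64-bit scalar
z = np.float32(2.0)   # NumPy 32-bit scalar

x_is_float = isinstance(x, float)        # True
x_is_np64 = isinstance(x, np.float64)    # False: a Python float is not a NumPy scalar
y_is_float = isinstance(y, float)        # True: np.float64 subclasses Python's float
z_is_float = isinstance(z, float)        # False: np.float32 does not

print(x_is_float, x_is_np64, y_is_float, z_is_float)
```

Note the asymmetry: an np.float64 passes an isinstance check against float, but a plain float does not pass one against np.float64.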
Python's floating-point numbers are usually 64-bit floating-point numbers, nearly equivalent to np.float64. In some unusual situations it may be useful to use floating-point numbers with more precision.
Python float values are represented as 64-bit double-precision values. The maximum finite value is approximately 1.8 × 10^308. A result that exceeds this maximum becomes inf (infinity) rather than raising an error, although some operations, such as exponentiation, raise OverflowError instead.
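To see the limit in practice, here is a minimal sketch using only the standard library (variable names are illustrative). It assumes the usual IEEE 754 double behind Python's float, which is the case on essentially all common platforms:

```python
import math
import sys

largest = sys.float_info.max      # largest finite double, about 1.8e308
overflowed = largest * 10         # multiplication overflows silently to inf
parsed = float("1e309")           # parsing a too-large literal also gives inf

# Exponentiation is one of the operations that raises instead.
base = 2.0
try:
    base ** 2000
    pow_raised = False
except OverflowError:
    pow_raised = True

print(largest, overflowed, parsed, pow_raised)
```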
float32 is a 32-bit number; float64 uses 64 bits. That means float64 values take up twice as much memory, and operations on them may be a lot slower on some machine architectures. However, float64 can represent numbers much more accurately than float32 (roughly 15 to 17 significant decimal digits versus roughly 6 to 9), and it also allows much larger numbers to be stored.
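As a rough illustration of the size and accuracy trade-off, the following sketch stores the value from the question in both widths (assuming NumPy is installed; the variable names are made up for this example):

```python
import numpy as np

value = 5.9975

a32 = np.float32(value)
a64 = np.float64(value)

size32 = np.dtype(np.float32).itemsize   # bytes per float32 element
size64 = np.dtype(np.float64).itemsize   # bytes per float64 element

# Distance from the original Python float after conversion.
err32 = abs(float(a32) - value)   # small but nonzero: float32 had to round
err64 = abs(float(a64) - value)   # zero: float64 is the same double

# Largest finite values for each type.
max32 = float(np.finfo(np.float32).max)
max64 = float(np.finfo(np.float64).max)

print(size32, size64, err32, err64, max32, max64)
```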
>>> numpy.float64(5.9975).hex()
'0x1.7fd70a3d70a3dp+2'
>>> (5.9975).hex()
'0x1.7fd70a3d70a3dp+2'
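They are the same number. What differs is only how they are printed: the Python native type uses a "sane" representation (the shortest decimal string that round-trips to the same double), and the NumPy type here uses an "accurate" representation (17 significant digits, enough to identify the double exactly). NumPy 1.14 and later also print the shortest round-tripping form by default.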
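If you want to convince yourself without relying on how a particular NumPy version prints scalars, you can format both values explicitly. This is just a sketch; the ".17g" format is my choice for reproducing the 17-digit output from the question:

```python
import numpy as np

a = 5.9975
b = np.float64(a)

same_value = bool(a == b)        # the two objects hold the same double

short = repr(a)                  # shortest string that round-trips
long_py = format(a, ".17g")      # 17 significant digits, Python float
long_np = format(b, ".17g")      # 17 significant digits, NumPy scalar

# Both strings parse back to exactly the same double.
round_trips = float(short) == float(long_py) == a

print(same_value, short, long_py, long_np, round_trips)
```

Both the short and the long string describe the same stored double; neither is "more correct", they just carry a different number of digits.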