
Numpy's float32 and float comparisons

Continuing from Difference between Python float and numpy float32:

import numpy as np

a = 58682.7578125
print(type(a), a)
float_32 = np.float32(a)
print(type(float_32), float_32)
print(float_32 == a)

Prints:

<class 'float'> 58682.7578125
<class 'numpy.float32'> 58682.8
True

I fully understand that comparing floats for equality is not a good idea, but shouldn't this still be False? We're talking about a difference in the first decimal digit, not in 0.000000001. Is it system dependent? Is this behavior documented somewhere?

EDIT: Well it's the third decimal:

print(repr(float_32), repr(a))
# 58682.758 58682.7578125

but can I trust repr? How are these values actually stored internally?

EDIT2: people insist that printing float_32 with more precision will give me its representation. However, as I already commented, according to numpy's docs:

the % formatting operator requires its arguments to be converted to standard python types

and:

print(repr(float(float_32)))

prints

58682.7578125

An interesting insight is given by @MarkDickinson here: apparently repr should be faithful (though he then says it's not faithful for np.float32).

So let me reiterate my question as follows:

  • How can I get at the exact internal representation of float_32 and a in the example? If these are the same, then the problem is solved. If not,
  • What are the exact rules for up/downcasting in a comparison between python's float and np.float32? I'd guess that it upcasts float_32 to float, although @WillemVanOnsem suggests in the comments that it's the other way round.

My python version:

Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:18:55) [MSC v.1900 64 bit (AMD64)] on win32

asked Jul 04 '17 08:07 by Mr_and_Mrs_D


1 Answer

The numbers compare equal because 58682.7578125 can be represented exactly in both 32-bit and 64-bit floating point. Let's take a close look at the binary representations:

32 bit:  01000111011001010011101011000010
sign    :  0
exponent:  10001110
fraction:  11001010011101011000010

64 bit:  0100000011101100101001110101100001000000000000000000000000000000
sign    :  0
exponent:  10000001110
fraction:  1100101001110101100001000000000000000000000000000000

They have the same sign, the same exponent (15 once the respective biases of 127 and 1023 are subtracted from the stored exponent fields), and the same fraction: the extra bits in the 64-bit representation are filled with zeros.
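
As a cross-check of the field breakdown above, here is a minimal sketch that uses only the standard library's struct module. The helper name decode_fields and its parameters are my own invention for illustration, and the reconstruction in the last line only holds for normal, finite values.

```python
import struct

def decode_fields(value, float_fmt, int_fmt, exp_bits, frac_bits):
    """Split an IEEE 754 binary float into (sign, unbiased exponent, fraction).

    Hypothetical helper for illustration; assumes a normal, finite value.
    """
    # Round-trip through bytes to read the float's bit pattern as an integer
    raw = struct.unpack(int_fmt, struct.pack(float_fmt, value))[0]
    fraction = raw & ((1 << frac_bits) - 1)
    stored_exp = (raw >> frac_bits) & ((1 << exp_bits) - 1)
    sign = raw >> (exp_bits + frac_bits)
    bias = (1 << (exp_bits - 1)) - 1  # 127 for 32 bit, 1023 for 64 bit
    return sign, stored_exp - bias, fraction

a = 58682.7578125
s32, e32, frac32 = decode_fields(a, '>f', '>I', 8, 23)
s64, e64, frac64 = decode_fields(a, '>d', '>Q', 11, 52)

print(s32 == s64, e32 == e64)          # same sign, same unbiased exponent
print(frac64 == frac32 << (52 - 23))   # 64 bit fraction = 32 bit fraction + 29 zero bits
print((1 + frac32 / 2**23) * 2**e32)   # rebuilds the value from the 32 bit fields
```

The stored exponent fields differ (they carry different biases), which is why the sketch compares the unbiased exponents instead.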

No matter which way they are cast, they will compare equal. If you try a different number such as 58682.7578124, you will see that the representations differ at the binary level; the 32-bit value loses more precision and they won't compare equal.
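
A quick sketch of that counterexample follows. How NumPy treats a plain Python float in a mixed comparison has changed between NumPy versions, so this sketch deliberately converts both sides to the same type before comparing; the variable names are illustrative only.

```python
import numpy as np

b = 58682.7578124        # not exactly representable in 32 bits
b32 = np.float32(b)      # rounds to the nearest float32
b64 = np.float64(b)      # keeps far more of the original digits

print(float(b32))                # 58682.7578125
print(np.float64(b32) == b64)    # False: compared as two 64 bit values
print(float(b32) == b)           # False: compared as two Python floats
```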

(It's also easy to see in the binary representation that a float32 can be upcast to a float64 without any loss of information. That is what numpy is supposed to do before comparing both.)
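
The lossless upcast can be checked directly. This is only a sketch of the widening itself and of promotion between two NumPy dtypes; it does not show what NumPy does internally when one operand is a plain Python float, which I would treat as version dependent.

```python
import numpy as np

x32 = np.float32(58682.7578124)   # any float32 value will do
x64 = np.float64(x32)             # widen to 64 bit

# Narrowing back recovers the original float32, so nothing was lost
print(np.float32(x64) == x32)

# NumPy itself classifies float32 -> float64 as a safe cast
print(np.can_cast(np.float32, np.float64, casting='safe'))

# Common dtype when a float32 meets a float64
print(np.result_type(np.float32, np.float64))
```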

import numpy as np

a = 58682.7578125

# Reinterpret each float's raw bytes as an unsigned integer of the same width
u32 = np.array(a, dtype=np.float32).view(dtype=np.uint32)
u64 = np.array(a, dtype=np.float64).view(dtype=np.uint64)

b32 = format(int(u32), '032b')  # bit string, zero-padded to 32 digits
print('32 bit: ', b32)
print('sign    : ', b32[0])
print('exponent: ', b32[1:9])
print('fraction: ', b32[9:])
print()

b64 = format(int(u64), '064b')  # bit string, zero-padded to 64 digits
print('64 bit: ', b64)
print('sign    : ', b64[0])
print('exponent: ', b64[1:12])
print('fraction: ', b64[12:])
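
If you only want the exact stored value rather than the individual bits, a shorter route is to widen the float32 to a Python float (which is exact) and let decimal.Decimal or float.hex print it without rounding. This is a sketch using the names from the question, not the only way to do it.

```python
from decimal import Decimal

import numpy as np

a = 58682.7578125
float_32 = np.float32(a)

# Decimal(float) converts the binary value exactly, with no rounding
exact_a = Decimal(a)
exact_32 = Decimal(float(float_32))

print(exact_a)               # 58682.7578125
print(exact_32)              # 58682.7578125
print(exact_a == exact_32)   # True

# float.hex() is another exact, round-trippable rendering
print(a.hex(), float(float_32).hex())
```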

answered Sep 23 '22 21:09 by MB-F