This question is more for curiosity.
I'm creating the following array:
from numpy import zeros
A = zeros((2,2))
for i in range(2):
    A[i,i] = 0.6
    A[(i+1)%2,i] = 0.4
print A
>>>
[[ 0.6 0.4]
[ 0.4 0.6]]
Then, printing it:
for i,c in enumerate(A):
    for j,d in enumerate(c):
        print j, d
I get:
>>>
0 0.6
1 0.4
0 0.4
1 0.6
But if I remove the j from the for loop, I get:
(0, 0.59999999999999998)
(1, 0.40000000000000002)
(0, 0.40000000000000002)
(1, 0.59999999999999998)
Is it because of the way I'm creating the matrix, using 0.6? How are real values represented internally?
There are a few different things going on here.
First, Python has two mechanisms for turning an object into a string, called repr and str. repr is supposed to give 'faithful' output that would (ideally) make it easy to recreate exactly that object, while str aims for more human-readable output. For floats in Python versions up to and including Python 3.1, repr gives enough digits to determine the value of the float completely (so that evaluating the returned string gives back exactly that float), while str rounds to 12 significant digits; this has the effect of hiding inaccuracies, but means that two distinct floats that are very close together can end up with the same str value, something that can't happen with repr. When you print an object, you get the str of that object. In contrast, when you just evaluate an expression at the interpreter prompt, you get the repr.
For example (here using Python 2.7):
>>> x = 1.0 / 7.0
>>> str(x)
'0.142857142857'
>>> repr(x)
'0.14285714285714285'
>>> print x # print uses 'str'
0.142857142857
>>> x # the interpreter read-eval-print loop uses 'repr'
0.14285714285714285
But also, a little bit confusingly from your point of view, we get:
>>> x = 0.4
>>> str(x)
'0.4'
>>> repr(x)
'0.4'
That doesn't seem to tie in too well with what you were seeing above, but we'll come back to this below.
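The two properties described above (repr round-trips exactly, while 12-digit rounding can make distinct floats look identical) can be checked with a small sketch. It is written for Python 3, where str of a float no longer rounds, so the '%.12g' format is used here as an assumed stand-in for the old str behaviour; the variable names are purely illustrative.

```python
# '%.12g' imitates the 12-significant-digit rounding that str() applied to
# floats in older Pythons (an assumption of this sketch, not a call into str).
x = 1.0 / 7.0
y = x + 2.0 ** -50            # a distinct float very close to x

short_x = '%.12g' % x
short_y = '%.12g' % y

print(x == y)                 # False: the two floats really are different
print(short_x == short_y)     # True: both round to '0.142857142857'
print(repr(x) == repr(y))     # False: repr keeps them apart
print(float(repr(x)) == x)    # True: repr round-trips exactly
```

So the short form is fine for display, but only the repr form is safe if the string has to identify the float uniquely.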
The second thing to bear in mind is that in your first example, you're printing two separate items, while in your second example (with the j removed), you're printing a single item: a tuple of length 2. Somewhat surprisingly, when converting a tuple for printing with str, Python nevertheless uses repr to compute the string representation of the elements of that tuple:
>>> x = 1.0 / 7.0
>>> print x, x # print x twice; uses str(x)
0.142857142857 0.142857142857
>>> print(x, x) # print a single tuple; uses repr(x)
(0.14285714285714285, 0.14285714285714285)
That explains why you're seeing different results in the two cases, even though the underlying floats are the same.
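The same container behaviour can be seen without any floats at all. The following is a minimal sketch (Python 3 syntax) using a hypothetical class, Demo, whose only purpose is to make it obvious which of the two conversions was called.

```python
class Demo:
    """Hypothetical helper: reports which conversion produced the string."""

    def __str__(self):
        return 'via str'

    def __repr__(self):
        return 'via repr'


d = Demo()
single = str(d)          # str() on the object itself calls __str__
in_tuple = str((d, d))   # str() on a tuple formats each element with repr
in_list = str([d])       # lists behave the same way

print(single)            # via str
print(in_tuple)          # (via repr, via repr)
print(in_list)           # [via repr]
```

This is why printing a float on its own and printing a tuple containing that float can show different digits for the same value.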
But there's one last piece to the puzzle. In Python >= 2.7, we saw above that for the particular float 0.4, the str and repr of that float were the same. So where does the 0.40000000000000002 come from? Well, you don't have Python floats here: because you're getting these values from a NumPy array, they're actually of type numpy.float64:
>>> from numpy import zeros
>>> A = zeros((2, 2))
>>> A[:] = [[0.6, 0.4], [0.4, 0.6]]
>>> A
array([[ 0.6, 0.4],
[ 0.4, 0.6]])
>>> type(A[0, 0])
<type 'numpy.float64'>
That type still stores a double-precision float, just like Python's float, but it's got some extra goodies that make it interact nicely with the rest of NumPy. And it turns out that NumPy uses a slightly different algorithm for computing the repr of a numpy.float64 than Python uses for computing the repr of a float. Python (in versions >= 2.7) aims to give the shortest string that still gives an accurate representation of the float, while NumPy simply outputs a string based on rounding the underlying value to 17 significant digits. Going back to that 0.4 example above, here's what NumPy does:
>>> from numpy import float64
>>> x = float64(1.0 / 7.0)
>>> str(x)
'0.142857142857'
>>> repr(x)
'0.14285714285714285'
>>> x = float64(0.4)
>>> str(x)
'0.4'
>>> repr(x)
'0.40000000000000002'
So these three things together should explain the results you're seeing. Rest assured that this is all completely cosmetic: the underlying floating-point value is not being changed in any way; it's just being displayed differently by the four different possible combinations of str and repr for the two types: float and numpy.float64.
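To see that the difference between "shortest string" and "17 significant digits" is only a difference in display, here is a small sketch in plain Python 3. It does not import NumPy: the '.17g' format is an assumed stand-in for the 17-digit rounding described above, not NumPy's actual code path, and the dictionary name shown is illustrative.

```python
values = [0.4, 0.6]
shown = {}

for v in values:
    shortest = repr(v)             # shortest string that round-trips
    seventeen = format(v, '.17g')  # value rounded to 17 significant digits
    # Both strings identify exactly the same underlying double.
    assert float(shortest) == v
    assert float(seventeen) == v
    shown[v] = (shortest, seventeen)
    print(shortest, seventeen)

# Output:
# 0.4 0.40000000000000002
# 0.6 0.59999999999999998
```

Both columns parse back to the identical float, which is the sense in which the extra digits are cosmetic.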
The Python tutorial gives more details of how Python floats are stored and displayed, together with some of the potential pitfalls. The answers to this SO question have more information on the difference between str and repr.
Don't mind me, I failed to realise that the question was about NumPy.
The strange 0.59999999999999998 and friends are Python's best attempt to accurately represent how computers store floating-point values: as a bunch of bits, according to the IEEE 754 standard. Notably, 0.1 has a non-terminating expansion in binary, and so cannot be stored exactly. (The same is true of 0.6 and 0.4.)
The reason you normally see 0.6 is that most floating-point printing functions round off these imprecisely stored floats, to make them more understandable to us humans. That's what your first printing example is doing.
Under some circumstances (that is, when the printing functions aren't trying to be human-readable), the full, slightly-off number 0.59999999999999998 will be printed. That's what your second printing example is doing.
This is not Python's fault; it is just how floats are stored.
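One way to look at what is actually stored is to convert the float to an exact decimal or an exact fraction. The sketch below (Python 3, standard library only) does that for the three values mentioned; the names stored, intended and errors are illustrative.

```python
from decimal import Decimal
from fractions import Fraction

errors = {}

for text in ('0.1', '0.4', '0.6'):
    stored = Decimal(float(text))   # exact value of the binary double
    intended = Decimal(text)        # exact value of the decimal literal
    ratio = Fraction(float(text))   # the same double as an exact fraction

    # A double is always an integer divided by a power of two.
    denominator = ratio.denominator
    assert denominator & (denominator - 1) == 0

    errors[text] = stored - intended
    print(text, 'is stored as', stored)
```

The sign of each entry in errors shows whether the stored double lies slightly above or slightly below the decimal value that was typed; for 0.4 it is above and for 0.6 it is below, which matches the trailing ...02 and ...98 seen in the question.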