The code
import numpy as np
a = 5.92270987499999979065
print(round(a, 8))
print(round(np.float64(a), 8))
gives
5.92270987
5.92270988
Any idea why?
I found nothing relevant in the NumPy sources.
Update:
I know that the proper way to deal with this problem is to construct programs in such a way that this difference is irrelevant. Which I do. I stumbled into it in regression testing.
Update2:
Regarding the @VikasDamodar comment: one shouldn't trust the repr() function:
>>> np.float64(5.92270987499999979065)
5.922709875
>>> '%.20f' % np.float64(5.92270987499999979065)
'5.92270987499999979065'
Update3:
Tested on python3.6.0 x32, numpy 1.14.0, win64. Also on python3.6.4 x64, numpy 1.14.0, debian.
Update4:
Just to be sure:
import numpy as np
a = 5.92270987499999979065
print('%.20f' % round(a, 8))
print('%.20f' % round(np.float64(a), 8))
5.92270987000000026512
5.92270988000000020435
Update5:
The following code demonstrates at which stage the difference arises, without using str:
>>> np.float64(a) - 5.922709874
1.000000082740371e-09
>>> a - 5.922709874
1.000000082740371e-09
>>> round(np.float64(a), 8) - 5.922709874
6.000000496442226e-09
>>> round(a, 8) - 5.922709874
-3.999999442783064e-09
Clearly, before applying 'round' they were the same number.
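A complementary check (a sketch using the standard decimal module, which displays the exact binary value stored in a double, with no string formatting in the way):

```python
import numpy as np
from decimal import Decimal

a = 5.92270987499999979065

# Decimal(float) converts the stored double exactly.
print(Decimal(a))

# The plain float and the np.float64 hold the same bits...
print(Decimal(a) == Decimal(float(np.float64(a))))   # True

# ...and that exact value is strictly below the .5 midpoint, so correct
# rounding to 8 decimal places must give ...987, not ...988.
print(Decimal(a) < Decimal('5.922709875'))           # True
```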
Update6:
In contrast to @user2357112's answer, np.round is roughly 4 times slower than round:
%%timeit a = 5.92270987499999979065
round(a, 8)
1.18 µs ± 26.5 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%%timeit a = np.float64(5.92270987499999979065)
round(a, 8)
4.05 µs ± 43.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Also, in my opinion, np.round did a better job of rounding to the nearest even than the builtin round: originally I got this 5.92270987499999979065 number through dividing 11.84541975 by two.
float.__round__ takes special care to produce correctly-rounded results, using a correctly-rounded double-to-string algorithm.
NumPy does not. The NumPy docs mention that
Results may also be surprising due to the inexact representation of decimal fractions in the IEEE floating point standard [R9] and errors introduced when scaling by powers of ten.
This is faster, but produces more rounding error. It leads to the error you've observed, as well as errors where numbers even more unambiguously below the cutoff still get rounded up:
>>> import numpy
>>> from decimal import Decimal
>>> x = 0.33499999999999996
>>> x
0.33499999999999996
>>> x < 0.335
True
>>> x < Decimal('0.335')
True
>>> x < 0.67/2
True
>>> round(x, 2)
0.33
>>> numpy.round(x, 2)
0.34000000000000002
You're getting a slower time for NumPy's rounding, but that doesn't have anything to do with which rounding algorithm is slower. Any time comparison between NumPy and regular Python math will boil down to the fact that NumPy is optimized for whole-array operations. Doing math on single NumPy scalars has a lot of overhead, but rounding an entire array with numpy.round easily beats rounding a list of floats with round:
In [6]: import numpy
In [7]: l = [i/7 for i in range(100)]
In [8]: a = numpy.array(l)
In [9]: %timeit [round(x, 1) for x in l]
59.6 µs ± 408 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [10]: %timeit numpy.round(a, 1)
5.27 µs ± 145 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
As for which one is more accurate, that's definitely float.__round__. Your number is closer to 5.92270987 than to 5.92270988, and it's round-ties-to-even, not round-everything-to-even. There's no tie here.
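A quick illustration of that last point: round-half-to-even only applies to values stored exactly at a midpoint, which this number is not.

```python
# Ties-to-even kicks in only for exact midpoints:
print(round(0.5))   # 0 -> tie goes to the even neighbour
print(round(1.5))   # 2
print(round(2.5))   # 2, not 3

# The stored double is strictly below the midpoint, so there is
# no tie and it simply rounds to the nearest value:
print(round(5.92270987499999979065, 8))   # 5.92270987
```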