Is this a bug? When I round to 4 decimal places, two nearly identical values give different results.
import pandas as pd
pd.set_option('display.precision', 10)
pd.DataFrame([[1.446450001],[1.44645]]).round(4)
Result:
        0
0  1.4465
1  1.4464
It's not a bug - rather, it's an undocumented quirk.
DataFrame.round uses numpy.around under the hood, and the numpy.around documentation says:
For values exactly halfway between rounded decimal values, Numpy rounds to the nearest even value. Thus 1.5 and 2.5 round to 2.0, -0.5 and 0.5 round to 0.0, etc.
http://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.around.html
Further reading on Wikipedia: https://en.wikipedia.org/wiki/Rounding#Round_half_to_even
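A quick way to see the documented behaviour is to round a few exact halves. This is only a small sketch; the array name `halves` is made up for illustration, and it uses values that are exactly representable as floats so that no binary representation error gets in the way.

```python
import numpy as np

# Exact halves: each of these is exactly representable as a binary float.
halves = np.array([0.5, 1.5, 2.5, 3.5, -0.5])

# numpy.around sends each tie to the nearest even integer,
# so 0.5 -> 0, 1.5 -> 2, 2.5 -> 2, 3.5 -> 4, -0.5 -> -0.
rounded = np.around(halves)
print(rounded)
```

Note that 2.5 goes down to 2 while 1.5 and 3.5 go up, which is the same effect seen in the question at the fourth decimal place.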
There are two different rounding strategies:
The first rounds the way you may have learned in school: values exactly halfway between two candidates (ending in 5) are rounded up.
The second rounds such halfway values to the nearest even digit.
The first strategy has a side effect: on average the rounded values carry a positive bias, because every tie is pushed upward. The second strategy fixes this with the arbitrary convention of rounding ties to the nearest even value, so roughly half of the ties go up and half go down.
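The bias can be made concrete with a small sketch using the standard-library decimal module, which supports both tie-breaking rules. The sample values and the helper `round_all` are hypothetical, chosen only so that every value is an exact tie.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN

# Four values that all sit exactly halfway between two integers.
values = [Decimal("0.5"), Decimal("1.5"), Decimal("2.5"), Decimal("3.5")]

def round_all(vals, mode):
    """Round every value to an integer using the given tie-breaking mode."""
    return [v.quantize(Decimal("1"), rounding=mode) for v in vals]

half_up = round_all(values, ROUND_HALF_UP)      # ties always go up
half_even = round_all(values, ROUND_HALF_EVEN)  # ties go to the even neighbour

true_mean = sum(values) / len(values)
half_up_mean = sum(half_up) / len(half_up)
half_even_mean = sum(half_even) / len(half_even)

print("true mean:     ", true_mean)
print("half-up mean:  ", half_up_mean)
print("half-even mean:", half_even_mean)
```

With these inputs the half-up mean comes out above the true mean, while the half-even mean matches it.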
Pandas chose to use numpy.around, which implements the second strategy.
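To check the two values from the question against both strategies, here is a sketch using the standard-library decimal module. It is not how pandas or NumPy compute the result; it treats the inputs as exact decimal strings, which is an assumption, since a binary float can only approximate 1.44645. The helper name `round4` is made up for this example.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN

def round4(text, mode):
    """Round a decimal string to 4 places with the given tie-breaking mode."""
    return Decimal(text).quantize(Decimal("0.0001"), rounding=mode)

# 1.44645 is an exact tie at the 4th decimal place; 1.446450001 lies just above it.
tie_even = round4("1.44645", ROUND_HALF_EVEN)        # tie goes to the even digit 4
tie_up = round4("1.44645", ROUND_HALF_UP)            # tie goes up to 5
above_even = round4("1.446450001", ROUND_HALF_EVEN)  # not a tie, rounds up
above_up = round4("1.446450001", ROUND_HALF_UP)      # not a tie, rounds up

print(tie_even, tie_up, above_even, above_up)
```

Only the exact tie is affected by the choice of strategy; the value just above the halfway point rounds up either way, which matches the output shown in the question.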