pandas rounding, is this a bug?

Is this a bug? When I round to 4 decimal places, the two values below give different results.

import pandas as pd
pd.set_option('display.precision', 10)

pd.DataFrame([[1.446450001],[1.44645]]).round(4)

result

    0
0   1.4465
1   1.4464
JOHN asked Feb 08 '23 00:02



2 Answers

It's not a bug - rather, it's an undocumented quirk.

DataFrame.round uses numpy.around under the hood, which:

For values exactly halfway between rounded decimal values, Numpy rounds to the nearest even value. Thus 1.5 and 2.5 round to 2.0, -0.5 and 0.5 round to 0.0, etc.

http://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.around.html

Further reading on Wikipedia: https://en.wikipedia.org/wiki/Rounding#Round_half_to_even
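A minimal sketch of the quoted behaviour, assuming NumPy is installed. The inputs are chosen to be exactly representable in binary floating point, so each one is a genuine tie; the name `ties` is only illustrative.

```python
import numpy as np

# Each value sits exactly halfway between two integers and is stored
# exactly in binary floating point, so the tie-breaking rule decides.
ties = np.array([0.5, 1.5, 2.5, 3.5, -0.5, -1.5])

# numpy.around breaks ties by choosing the even neighbour.
rounded = np.around(ties)

for value, result in zip(ties, rounded):
    print(f"{value:5.1f} -> {result:5.1f}")
```

Note that a decimal literal such as 1.44645 is not stored exactly as a binary float, so results near a tie also depend on the value that is actually stored.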

zw324 answered Feb 20 '23 08:02



There are two different rounding strategies:

  • The first rounds the way you may have learned in school: values exactly halfway between two candidates (ending in 5) are rounded up.

  • The second rounds halfway values to the nearest even number.

The first strategy has the side effect of introducing a positive bias in the mean, because every halfway value is pushed upwards. The second strategy fixes this with the arbitrary decision to round halfway values to the nearest even value, so that about half of them go up and half go down.

pandas uses numpy.around, which implements the second strategy.
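The bias described above can be checked with a small sketch. It uses the standard-library decimal module rather than NumPy, so the halfway values are exact; the helper `round_all` and the sample values are illustrative assumptions, not something pandas does internally.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN
from statistics import mean

# Exact halfway values 0.5, 1.5, ..., 9.5 (Decimal avoids binary float error).
ties = [Decimal(n) + Decimal("0.5") for n in range(10)]


def round_all(values, mode):
    """Round every value to an integer using the given decimal rounding mode."""
    return [v.quantize(Decimal("1"), rounding=mode) for v in values]


true_mean = mean(ties)
half_up_mean = mean(round_all(ties, ROUND_HALF_UP))      # "school" rounding
half_even_mean = mean(round_all(ties, ROUND_HALF_EVEN))  # round half to even

print("mean before rounding:", true_mean)
print("mean after half-up:  ", half_up_mean)
print("mean after half-even:", half_even_mean)
```

For this sample the half-up mean ends up 0.5 above the true mean, while the half-even mean matches it.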

MaxNoe answered Feb 20 '23 08:02
