It seems there is an error in round function. Below I would expect it to return 6, but it returns 5.
round(5.5)
# 5
Other then 5.5, such as 6.5, 4.5 returns 7, 5 as we expect.
Any explanation?
This behaviour is explained in the help file of the ?round
function:
Note that for rounding off a 5, the IEC 60559 standard is expected to be used, ‘go to the even digit’. Therefore round(0.5) is 0 and round(-1.5) is -2. However, this is dependent on OS services and on representation error (since e.g. 0.15 is not represented exactly, the rounding rule applies to the represented number and not to the printed number, and so round(0.15, 1) could be either 0.1 or 0.2).
round( .5 + 0:10 )
#### [1] 0 2 2 4 4 6 6 8 8 10 10
Another relevant email exchange by Greg Snow: R: round(1.5) = round(2.5) = 2?:
The logic behind the round to even rule is that we are trying to represent an underlying continuous value and if x comes from a truly continuous distribution, then the probability that x==2.5 is 0 and the 2.5 was probably already rounded once from any values between 2.45 and 2.54999999999999..., if we use the round up on 0.5 rule that we learned in grade school, then the double rounding means that values between 2.45 and 2.50 will all round to 3 (having been rounded first to 2.5). This will tend to bias estimates upwards. To remove the bias we need to either go back to before the rounding to 2.5 (which is often impossible to impractical), or just round up half the time and round down half the time (or better would be to round proportional to how likely we are to see values below or above 2.5 rounded to 2.5, but that will be close to 50/50 for most underlying distributions). The stochastic approach would be to have the round function randomly choose which way to round, but deterministic types are not comforatable with that, so "round to even" was chosen (round to odd should work about the same) as a consistent rule that rounds up and down about 50/50.
If you are dealing with data where 2.5 is likely to represent an exact value (money for example), then you may do better by multiplying all values by 10 or 100 and working in integers, then converting back only for the final printing. Note that 2.50000001 rounds to 3, so if you keep more digits of accuracy until the final printing, then rounding will go in the expected direction, or you can add 0.000000001 (or other small number) to your values just before rounding, but that can bias your estimates upwards.
When I was in college, a professor of Numerical Analysis told us that the way you describe for rounding numbers is the correct one. You shouldn't always round up the number (integer).5, because it is equally distant from the (integer) and the (integer + 1). In order to minimize the error of the sum (or the error of the average, or whatever), half of those situations should be rounded up and the other half should be rounded down. The R programmers seem to share the same opinion as my professor of Numerical Analysis...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With