Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Overflow issues when implementing math formulas

I heard that, when computing mean value, start+(end-start)/2 differs from (start+end)/2 because the latter can cause overflow. I do not quite understand why this second one can cause overflow while the first one does not. What are the generic rule to implement a math formula that can avoid overflow.

like image 365
bit-question Avatar asked Dec 02 '22 22:12

bit-question


2 Answers

Suppose you are using a computer where the maximum integer value is 10 and you want to compute the average of 5 and 7.

The first method (begin + (end-begin)/2) gives

5 + (7-5)/2 == 5 + 2/2 == 6

The second method (begin + end)/2 gives an overflow, since the intermediate 12 value is over the maximum value of 10 that we accept and "wraps over" to something else (if you are using unsigned numbers its usual to wrap back to zero but if your numbers are signed you could get a negative number!).

12/2 => overflow occurs => 2/2 == 1

Of course, in real computers integers overflow at a large value like 2^32 instead of 10, but the idea is the same. Unfortunately, there is no "general" way to get rid of overflow that I know of, and it greatly depends on what particular algorithm you are using. And event then, things get more complicated. You can get different behaviour depending on what number type you are using under the hood and there are other kinds of numerical errors to worry about in addition to over and underflow.

like image 189
hugomg Avatar answered Dec 14 '22 23:12

hugomg


Both your formulas will overflow, but under different circumstances:

  • The (start+end) part of your (start+end)/2 formula will overflow when start and end are both close to the integer limit on the same side of the range (i.e. both positive or both negative).
  • The (end-start) part of your start+(end-start)/2 formula will overflow when start is positive and end is negative, and both values are close to the respective ends of the representable integer values.

There are no "generic" rules, you do it case-by-case: look at parts of your formula, think of situations that could cause overflow, and come up with ways to avoid it. For example, the start+(end-start)/2 formula can be shown to avoid overflow when you average values with the same sign.

This is the hard way; the easy way is to use higher-capacity representations for intermediate results. For example, if you use long long instead of int to do intermediate calculations and copy the results back to int only when you are done, you will avoid overflow assuming that the end result fits in an int.

like image 20
Sergey Kalinichenko Avatar answered Dec 15 '22 00:12

Sergey Kalinichenko