Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Midpoint Formula Overflow Error

I'm learning algorithms/big o and i was just curious about this.

The use of

 mid = (low+high)/2;

in order to get the midpoint for a binary search algorithm is generally discouraged because of the possibility of an overflow error. Why would this cause an overflow error to occur, and how does

 mid = low + (high-low)/2;

prevent this error?

Thanks.

like image 400
hmir Avatar asked Jun 19 '14 22:06

hmir


People also ask

How do you find the midpoint without overflow?

Instead we can use the below formula to calculate the mid index: mid = start + (end-start)/2; Clearly “(end-start)/2” is less than “end” and this formula will not cause integer overflow for large values of start and end.

What happens when integer overflow?

An integer overflow can cause the value to wrap and become negative, which violates the program's assumption and may lead to unexpected behavior (for example, 8-bit integer addition of 127 + 1 results in −128, a two's complement of 128).

How do you find the midpoint of index?

So, the correct way of calculating mid in a binary search is mid = low + ((high - low) / 2) in order to handle overflow errors.

How is the midpoint of an ordered list calculated in binary search?

The midpoint is found by adding the lowest position to the highest position and dividing by 2. NOTE - if the answer is a decimal, round up. For example, 3.5 becomes 4. An alternative is to round down, but be consistent.


2 Answers

In the first case you calculate the value (low+high) which might be too huge to fit into an int, if low and high are both huge enough (say if both are equal to 2^30+1 /or even greater/). In the second case, you don't calculate (low+high), you do a small trick and you go through the expression (high-low) and that expression is much safer with respect to int overflow.

Still, if you don't have an array whose size is greater than 2^30 (which is quite a huge array anyway), I don't see how you can run into an int overflow even when using the first expression. So I would just use the first one in most cases and not worry.

like image 54
peter.petrov Avatar answered Oct 22 '22 08:10

peter.petrov


Lets assume the worst situation for integer. We are sure that mid = (low + high)/2 can never give a overflow if we look at the bigger picture. Assuming that low = 2ˆ31 - 1 and high = 2ˆ31 - 1 ( the highest integer values) the total calculation will be [2*(2ˆ31-1)] / 2 = 2ˆ31 - 1. ( the biggest number int can hold)

However the way we calculate it matters.

  • If it is calculated using (low + high)/2 it will simply give an overflow in our situation because first it will try to sum low( 2ˆ31 -1 ) and high ( 2ˆ31 -1 ) and BAM. 2*(2ˆ31-1) is an overflow. It can not divide it by two because it can not store this value initially in an int.

  • If it is calculated using low + (high-low)/2 it can not give an overflow . Think it that way, low + (high - low)/2 = (low + x) where x = (high - low)/2. In order to create an overflow we should pick big (in our situation biggest for int) numbers for both sides. However if we pick low the biggest, then x will be the smallest( because we are subtracting low from high in x) and vice versa.

like image 38
epipav Avatar answered Oct 22 '22 07:10

epipav