Binary Search using start < end vs. using start <= end

Tags: java, algorithm, binary-search

In binary search, we usually have low and high variables and typically there is a while loop that tests if low <= high, as shown in this code (from Wikipedia):

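int SortedArray[max] = {....};   /* sorted input; contents elided in the original */

int BinarySearch (int key)
{
    int start = 0;
    int end = max - 1;           /* inclusive upper bound */
    int mid;
    while (start <= end)         /* the range [start, end] is non-empty */
    {
        mid = (start + end) / 2;
        if (key == SortedArray[mid])
            return mid;
        else if (key < SortedArray[mid])
            end = mid - 1;
        else
            start = mid + 1;
    }
    return -1;                   /* key not found */
}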

When learning binary search, I was always taught the start <= end approach, but when seeing other implementations, I've seen a lot of people do while(start < end).

Is there an advantage to one versus the other? In my own native implementations, I do the <= approach but when I switch it out for <, the search fails.

Is there a rule of thumb for using one versus the other?

asked May 28 '17 19:05

javanewbie

2 Answers

Even though your question is not entirely clear, I infer you are talking about this kind of binary search implementation (here in C, from Wikipedia):

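int SortedArray[max] = {....};   /* sorted input; contents elided in the original */

int BinarySearch (int key)
{
    int start = 0;
    int end = max - 1;           /* inclusive upper bound */
    int mid;
    while (start <= end)         /* the range [start, end] is non-empty */
    {
        mid = (start + end) / 2;
        if (key == SortedArray[mid])
            return mid;
        else if (key < SortedArray[mid])
            end = mid - 1;
        else
            start = mid + 1;
    }
    return -1;                   /* key not found */
}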

If you replace start <= end with start < end, there are cases where the algorithm will not give the correct answer.

Let's consider two cases.

1 - You search for 1 in the list [1]. Here start = 0 and end = 0, so with the changed loop condition the loop body never runs and the algorithm returns -1.

2 - You search for 2 in the list [1, 2]. Here start = 0 and end = 1. The algorithm sets mid = (0+1)/2 = 0 (integer division in C), and the element at mid, which is 1, is less than the key. That makes start = 1 and end = 1. Again, if you stop the loop here, the algorithm returns -1 instead of 1.

And there are probably many other examples.
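The two cases above can be checked mechanically. Here is a minimal sketch, assuming a hypothetical helper `search_inclusive` that takes the array and its length as parameters, plus a `strict` flag that switches the loop test from `<=` to `<`. None of these names come from the original code.

```c
/* Hypothetical helper for illustration: binary search with an
 * inclusive upper bound (end = n - 1).
 * strict == 0: loop while start <= end (the usual condition).
 * strict != 0: loop while start <  end (the swapped-in condition). */
int search_inclusive(const int *a, int n, int key, int strict)
{
    int start = 0;
    int end = n - 1;
    while (strict ? (start < end) : (start <= end)) {
        int mid = (start + end) / 2;
        if (key == a[mid])
            return mid;
        else if (key < a[mid])
            end = mid - 1;
        else
            start = mid + 1;
    }
    return -1;  /* loop ended without a match */
}
```

With `strict` set, the loop exits while one candidate element is still unexamined. That is why both cases report -1 even though the key is present.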

Have a nice day

answered Sep 24 '22 01:09

Alexis Clarembeau

For low <= high, high is considered inclusive (high is part of the range we consider).

For low < high, high is considered exclusive (high is not part of the range we consider).

Both can be correct, but there will be minor differences in the rest of the code, specifically how high is initialised (high = length-1 versus high = length) and how it's updated (high = mid-1 versus high = mid).
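To make those differences concrete, here is a sketch of the two variants side by side. The function names `search_closed` and `search_half_open` are made up for illustration. Each takes the array and its length as parameters rather than using a global.

```c
/* Inclusive upper bound: the candidate range is [low, high]. */
int search_closed(const int *a, int n, int key)
{
    int low = 0;
    int high = n - 1;              /* last valid index */
    while (low <= high) {          /* range is non-empty */
        int mid = (low + high) / 2;
        if (a[mid] == key)
            return mid;
        else if (key < a[mid])
            high = mid - 1;        /* mid is ruled out, so step past it */
        else
            low = mid + 1;
    }
    return -1;
}

/* Exclusive upper bound: the candidate range is [low, high). */
int search_half_open(const int *a, int n, int key)
{
    int low = 0;
    int high = n;                  /* one past the last valid index */
    while (low < high) {           /* range is non-empty */
        int mid = (low + high) / 2;
        if (a[mid] == key)
            return mid;
        else if (key < a[mid])
            high = mid;            /* mid is ruled out; high is already exclusive */
        else
            low = mid + 1;
    }
    return -1;
}
```

In both versions the loop condition means "the range still contains at least one element". The initialisation and the update of `high` have to agree with whichever convention is chosen. Mixing them, for example `low < high` with `high = n - 1`, is what makes the search miss elements.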

Which one is better?

The main difference is that mid = (low + high) / 2 will be slightly different for each case.

More specifically, high is 1 larger in the exclusive case. When high-low is even in the inclusive case, mid is the same for both; when high-low is odd in the inclusive case, mid is 1 element larger in the exclusive case (because integer division rounds down).

Let's consider an example:

length = 6
low = 0
highInclusive = 5, highExclusive = 6
midInclusive = 5/2 = 2, midExclusive = 6/2 = 3

As you can see, when there is no single middle element, one will pick the element to the left and the other will pick the element to the right.
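The same arithmetic can be written as a pair of small C helpers. The names `mid_inclusive` and `mid_exclusive` are hypothetical and low is fixed at 0, as in the example.

```c
/* Hypothetical helpers for illustration; low is fixed at 0. */

/* First midpoint when high is inclusive (high = length - 1). */
int mid_inclusive(int length)
{
    int low = 0, high = length - 1;
    return (low + high) / 2;   /* integer division */
}

/* First midpoint when high is exclusive (high = length). */
int mid_exclusive(int length)
{
    int low = 0, high = length;
    return (low + high) / 2;
}
```

For length 6 these give 2 and 3, matching the numbers above. For length 5 both give 2, since an odd-length range has a single middle element.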

While this will sometimes make one faster and sometimes the other, the average running time will be pretty much identical.

From a readability perspective, it might be slightly better (in my opinion) to use the exclusive one in languages with 0-based arrays and either one in languages with 1-based arrays, in order to minimise the number of -1's in the code. An argument could also be made for sticking to a single version in all languages, so that people don't need to understand both versions or get confused between the two.

answered Sep 24 '22 01:09

Bernhard Barker
