Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Java collection binarySearch not working properly

I am just trying to use native Java binarySearch hoping it can always find the first occurrence. But it's not always return the first occurrence, what have I done wrong here ?

import java.util.*;

class BinarySearchWithComparator
{
  public static void main(String[] args)
  {
    // Please scroll down to see 'User' class implementation.
    List<User> l = new ArrayList<User>();
    l.add(new User(10, "A"));
    l.add(new User(10, "A"));
    l.add(new User(10, "A"));

    l.add(new User(20, "B"));
    l.add(new User(20, "B"));
    l.add(new User(20, "B"));
    l.add(new User(20, "B"));
    l.add(new User(20, "B"));
    l.add(new User(20, "B"));
    l.add(new User(20, "B"));


    l.add(new User(30, "C"));
    l.add(new User(30, "C"));
    l.add(new User(30, "C"));
    l.add(new User(30, "C"));

    Comparator<User> c = new Comparator<User>() {
      public int compare(User u1, User u2) {
        return u1.getId().compareTo(u2.getId());
      }
    };

    // Must pass in an object of type 'User' as the key.
    // The key is an 'User' with the 'id' which is been searched for.
    // The 'name' field is not used in the comparison for the binary search,
    // so it can be a dummy value -- here it is omitted with a null.
    //
    // Also note that the List must be sorted before running binarySearch,
    // in this case, the list is already sorted.
    Collections.sort(l,c);

    int index = Collections.binarySearch(l, new User(10, null), c);
    System.out.println(index);

    index = Collections.binarySearch(l, new User(20, null), c);
    System.out.println(index);

    index = Collections.binarySearch(l, new User(30, null), c);
    System.out.println(index);

    index = Collections.binarySearch(l, new User(42, null), c);
    System.out.println(index);
  }
}

class User {
  private int id;
  private String name;

  public User(int id, String name) {
    this.id = id;
    this.name = name;
  }

  public Integer getId() {
    return Integer.valueOf(id);
  }
}

========

Result:
2 //not 0 ?
6 //not 3?
10 //ok
-15  ok

Why first 10 is not index 0 ?
Why first 20 is not index 3 ?
OK , 30 first occurrence is index 10

======================================

Update

Now it seems it's the API does not not gaurante that ! Can anyone give me working example of how to find the 1st occurrence and last occurrence for a given element (say User(10,null)?

Thanks a lot.

like image 910
RoundPi Avatar asked Mar 25 '13 01:03

RoundPi


People also ask

Why binary search is not working?

You are going to into infinite recursion. The base case of binary search is when hi is less than lo . You also want to change your left call to go from low to m-1 rather than pivot because your bounds are inclusive on both ends.

What does collections binarySearch return if not found?

binarySearch. Searches the specified list for the specified object using the binary search algorithm. The list must be sorted into ascending order according to the specified comparator (as by the sort(List, Comparator) method), prior to making this call. If it is not sorted, the results are undefined.

What does Java's Arrays binarySearch() method return?

This method returns index of the search key, if it is contained in the array, else it returns (-(insertion point) - 1). The insertion point is the point at which the key would be inserted into the array: the index of the first element greater than the key, or a.


4 Answers

Java does not make a guarantee as to which element among the equal ones it is going to return. When you have multiple elements in an equal range, you need to walk the list down to find the first element that's equal to what you are looking for, like this:

User lookFor = new User(30, null);
index = Collections.binarySearch(l, lookFor, c);
while (index > 0 && c.compare(lookFor, l[index-1]) == 0) {
    index--;
}
// At this point the index is at the first element of the equal range

You can wrap this logic in a static function to avoid writing an explicit loop each time.

Note that if your collection is random access, this will cause worst-case performance (when there are many identical elements) to degrade from O(lg(n)) to O(n).

like image 154
Sergey Kalinichenko Avatar answered Oct 04 '22 03:10

Sergey Kalinichenko


but you have more than 1 element with id 10 on your list. So binarySearch isn't wrong

Java manual for binarySearch says this:

Searches the specified list for the specified object using the binary search algorithm. The list must be sorted into ascending order according to the natural ordering of its elements (as by the sort(List) method) prior to making this call. If it is not sorted, the results are undefined. If the list contains multiple elements equal to the specified object, there is no guarantee which one will be found.

http://docs.oracle.com/javase/6/docs/api/java/util/Collections.html#binarySearch(java.util.List, T)

like image 23
gerrytan Avatar answered Oct 04 '22 03:10

gerrytan


from the javadoc on Array.binarySearch():

Searches a range of the specified array of bytes for the specified value using the
binary search algorithm. The range must be sorted (as by the sort(byte[], int, int) 
method) prior to making this call. If it is not sorted, the results are undefined. 
If the range contains multiple elements with the specified value, there is no 
guarantee which one will be found.
like image 43
arcy Avatar answered Oct 04 '22 05:10

arcy


Many of the items in your list are equal, for sorting purposes. If two items have the same ID, then it doesn't really matter which one you use, from a sorting perspective. Collections.binarySearch just splits the list in half until it finds a match. Once it finds a match, it is not going to check the previous item to see if it's also a match, it will just return the index.

Consider a more robust compareTo implementation, which sorts by more than just the ID, if items will have IDs.

like image 35
Sam Barnum Avatar answered Oct 04 '22 05:10

Sam Barnum