Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Range lookup in Java

Suppose, I have an unsorted array of overlapped ranges. Each range is just a pair of integers begin and end. Now I want to find if a given key belongs to at least one of the ranges. Probably, I have to know the ranges it belongs as well.

We can assume the ranges array takes ~1M and fits the memory. I am looking for an easy algorithm, which uses only standard JDK collections without any 3d-party libraries and special data structures, but works reasonably fast.

What would you suggest?

like image 367
Michael Avatar asked Nov 18 '11 15:11

Michael


4 Answers

Sort the ranges numerically by a custom Comparator, then for each key k build a one-element range [k, k] and do a binary search for this range with a different Comparator.

The Comparator for searching's compare(x,y) should return

  • <0 if x.max < y.min
  • >0 if x.min > y.max
  • 0 otherwise (its two range arguments overlap).

As noted by @Per, you need a different, stricter Comparator for sorting, but the first two clauses still hold.

This should work even if the ranges overlap, though you may want to merge overlapping ranges after sorting to speed up the search. The merging can be done in O(N) time.

This is in effect a static interval tree, i.e. one without O(lg N) insertion or deletion, in the same way that a sorted array can be considered a static binary search tree.

like image 90
Fred Foo Avatar answered Oct 14 '22 19:10

Fred Foo


If you don't need to know which interval contains your point (EDIT: I guess you probably do, but I'll leave this answer for others with this question who don't), then

  1. Preprocess the intervals by computing two arrays B and E. B is the values of begin in sorted order. E is the values of end in sorted order.

  2. To query a point x, use binary search to find the least index i such that B[i] > x and the least index j such that E[j] ≥ x. The number of intervals [begin, end] containing x is i - j.


class Interval {
    double begin, end;
}

class BeginComparator implements java.util.Comparator<Interval> {
    public int compare(Interval o1, Interval o2) {
        return Double.compare(o1.begin, o2.begin);
    }
};

public class IntervalTree {
    IntervalTree(Interval[] intervals_) {
        intervals = intervals_.clone();
        java.util.Arrays.sort(intervals, new BeginComparator());
        maxEnd = new double[intervals.length];
        initializeMaxEnd(0, intervals.length);
    }

    double initializeMaxEnd(int a, int b) {
        if (a >= b) {
            return Double.NEGATIVE_INFINITY;
        }
        int m = (a + b) >>> 1;
        maxEnd[m] = initializeMaxEnd(a, m);
        return Math.max(Math.max(maxEnd[m], intervals[m].end), initializeMaxEnd(m + 1, b));
    }

    void findContainingIntervals(double x, int a, int b, java.util.Collection<Interval> result) {
        if (a >= b) {
            return;
        }
        int m = (a + b) >>> 1;
        Interval i = intervals[m];
        if (x < i.begin) {
            findContainingIntervals(x, a, m, result);
        } else {
            if (x <= i.end) {
                result.add(i);
            }
            if (maxEnd[m] >= x) {
                findContainingIntervals(x, a, m, result);
            }
            findContainingIntervals(x, m + 1, b, result);
        }
    }

    java.util.Collection<Interval> findContainingIntervals(double x) {
        java.util.Collection<Interval> result  = new java.util.ArrayList<Interval>();
        findContainingIntervals(x, 0, intervals.length, result);
        return result;
    }

    Interval[] intervals;
    double[] maxEnd;

    public static void main(String[] args) {
        java.util.Random r = new java.util.Random();
        Interval[] intervals = new Interval[10000];
        for (int j = 0; j < intervals.length; j++) {
            Interval i = new Interval();
            do {
                i.begin = r.nextDouble();
                i.end = r.nextDouble();
            } while (i.begin >= i.end);
            intervals[j] = i;
        }
        IntervalTree it = new IntervalTree(intervals);
        double x = r.nextDouble();
        java.util.Collection<Interval> result = it.findContainingIntervals(x);
        int count = 0;
        for (Interval i : intervals) {
            if (i.begin <= x && x <= i.end) {
                count++;
            }
        }
        System.out.println(result.size());
        System.out.println(count);
    }
}
like image 21
Per Avatar answered Oct 14 '22 18:10

Per


I believe this is what you are looking for: http://en.wikipedia.org/wiki/Interval_tree

But check this simpler solution first to see if it fits your needs: Using java map for range searches

like image 28
Paul Jackson Avatar answered Oct 14 '22 19:10

Paul Jackson


simple solution with O(n) complexity:

for(Range range: ranges){
  if (key >= range.start && key <= range.end)
    return range;
} 

More clever algorithm can be applied if we know more information about ranges. Is they sorted? Is they overlapped? and so on

like image 36
mishadoff Avatar answered Oct 14 '22 19:10

mishadoff