Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the correct java data structure and search algorithm to search a set of ip ranges for inclusion

Tags:

java

Given the following list if IP address ranges.

LOW           HIGH
192.168.10.34 192.168.11.200
200.50.1.1    200.50.2.2

Assuming that iterating all the addresses within all the ranges and storing them in a HashSet will use too much memory.

So we need to maintain the ranges, which are actually just shifted numbers anyway (low and high). How would one store the ranges in a tree and then use an efficient search to find whether or not a particular address is within the range(s).

I have looked at RangeTree, but it seems to search for points on a cartesian plane, and doesn't seem suited to my use case. KDTree allows one to search nearest neighbour, and when k==2 it's basically the same as the range tree.

I would want all the matches, as there might be multiple hits.

What would work here? I am very sure this is a solved problem.

like image 864
Jim Avatar asked Oct 15 '22 05:10

Jim


2 Answers

A standard TreeMap will do, assuming you have no overlapping ranges.

Simply create a TreeMap, keyed by the lower end of the range, with the full range as the value. Locate the range using floorEntry(K key) and verify the value is in the range.

Example using simple Integer instead of IpAddress object, but any Comparable object can be used as range boundaries.

public final class RangeMap<T extends Comparable<T>> {
    private TreeMap<T, Range<T>> map = new TreeMap<>();
    public void add(Range<T> range) {
        Entry<T, Range<T>> entry = this.map.floorEntry(range.getUpperInclusive());
        if (entry != null && entry.getValue().overlaps(range))
            throw new IllegalArgumentException("Overlap: " + range + " vs " + entry.getValue());
        entry = this.map.ceilingEntry(range.getLowerInclusive());
        if (entry != null && entry.getValue().overlaps(range))
            throw new IllegalArgumentException("Overlap: " + range + " vs " + entry.getValue());
        this.map.put(range.getLowerInclusive(), range);
    }
    public boolean contains(T value) {
        Entry<T, Range<T>> entry = this.map.floorEntry(value);
        return (entry != null && entry.getValue().contains(value));
    }
    public Range<T> get(T value) {
        Entry<T, Range<T>> entry = this.map.floorEntry(value);
        return (entry != null && entry.getValue().contains(value) ? entry.getValue() : null);
    }
}
public final class Range<T extends Comparable<T>> {
    private final T lowerInclusive;
    private final T upperInclusive;
    public Range(T lowerInclusive, T upperInclusive) {
        this.lowerInclusive = lowerInclusive;
        this.upperInclusive = upperInclusive;
    }
    public T getLowerInclusive() {
        return this.lowerInclusive;
    }
    public T getUpperInclusive() {
        return this.upperInclusive;
    }
    public boolean contains(T value) {
        return (value.compareTo(this.lowerInclusive) >= 0 &&
                value.compareTo(this.upperInclusive) <= 0);
    }
    public boolean overlaps(Range<T> that) {
        return (this.lowerInclusive.compareTo(that.upperInclusive) <= 0 &&
                this.upperInclusive.compareTo(that.lowerInclusive) >= 0);
    }
    @Override
    public String toString() {
        return "(" + this.lowerInclusive + "-" + this.upperInclusive + ")";
    }
}

Test

RangeMap<Integer> map = new RangeMap<>();
map.add(new Range<>(50, 59));
map.add(new Range<>(70, 79));
for (int i = 40; i <= 90; i += 3)
    System.out.println(i + ": " + map.contains(i));

Output

40: false
43: false
46: false
49: false
52: true
55: true
58: true
61: false
64: false
67: false
70: true
73: true
76: true
79: true
82: false
85: false
88: false
like image 56
Andreas Avatar answered Oct 22 '22 02:10

Andreas


If I'm understanding this right, you have a list of ranges, and you want to put them into a structure so that it is quick to find which ranges a given value is within.

You could create a single Binary Search Tree that stores each value that is either end of a range (along with 0.0.0.0 probably) and loop forwards through that once maintaining a set of which ranges the current value is in and setting a copy of that as the data of each node upon reaching it, and then to find the ranges a given IP is within you'd simply use a Binary Search to find the highest bound value below that, and the set associated with that value would be the set of ranges that the given IP is within.

It'd be O(n^2) memory usage (and probably O(n^2) time to construct the structure, assuming that copying takes linear time based on the size of the data being copied, which I can't remember), but a single query would be O(logn) time.

Here's an image because I feel like this isn't a great explanation:

A graphical representation of the data structure

like image 1
The Zach Man Avatar answered Oct 22 '22 01:10

The Zach Man