Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find the smallest area rectangle covering a query point

I'm working on a personal project to do with computational geometry. The question in the title is an abstraction of one of the small subproblems that I am trying, but struggling, to solve efficiently. Hopefully it's general enough to maybe be of use of more than just me!


The problem

Imagine we have a set S of rectangles in the plane, all of which have edges parallel to the coordinate axes (no rotations). For my problem we'll assume that rectangle intersections are very common. But they are also very nice: If two rectangles intersect, we can assume one of them always completely contains the other. So there's no "partial" overlaps.

I want to store these rectangles in a way that:

  • We can efficiently add new rectangles.
  • Given a query point (x,y) we can efficiently report back the rectangle of smallest area that contains the point.

The illustration provides the motivation for the latter. We always want to find the most deeply nested rectangle that contains the query point, so that's always the one of smallest area.

.


My thoughts

So I know that both R-Trees and Quad-Trees are often used for spatial indexing problems, and indeed both can work well in some cases. The problem with R-Trees is that they can degrade to linear performance in the worst case.

I thought about building a set of balanced binary trees based on nestedness. The left subtree of node r contains all the rectangles that are inside rectangle r. The right subtree contains all the rectangles that r is inside of. The illustrated example would have three trees.

But what if none of the rectangles are nested? Then you need O(n) trees of 1 element and again we have something that performs just as poorly as a linear scan through the boxes.


How could I solve this in a way that we have asymptotically sub linear time in the worst case? Even if that means sacrificing some performance in the best cases or storage requirements. (I assume for a problem like this, there may be a need to maintain two data structures and that's cool)

I am certain that the very specific way in which rectangles are allowed to intersect should help make this problem possible. In fact, it looks like a candidate for logarithmic performance to me but I'm just not getting anywhere.

Thanks in advance for any ideas you have!

like image 838
Jay Avatar asked Mar 10 '23 11:03

Jay


2 Answers

I'd suggest storing the rectangles per nesting level, and tackling the rectangle-finding per level. Once you've found which top-level rectangle the point is in, you can then look at the second-level rectangles that are inside that rectangle, find the rectangle the point is in using the same method, then look at the third-level, and so on.

To avoid a worst-case of O(n) to find the rectangle, you could use a sort of ternary spatial tree, where you repeatedly draw a vertical line across the space and divide the rectangles into three groups: those to the left (blue), those intersected by (red), and those to the right (green) of the line. For the group of intersected rectangles (or once a vertical line would intersect most or all of the rectangles), you switch to a horizontal line and divide the rectangles into groups above, intersected by, and below the line.

ternary spatial tree

You would then repeatedly check whether the point is to the left/right or above/below the line, and go on to check the rectangles on the same side and those intersected by the line.

In the example, only four rectangles would actually need to be checked to find which rectangle contains the point.


If we use the following numbering for the rectangles in the example:

rectangle numbering

then the ternary spatial tree would be something like this:

ternary spatial tree

like image 159

You can partition the area from xMin to xMax and yMin to yMax along the edges of the rectangles. This gives at most (2n - 1)^2 fields. Each of the fields is either completely empty or occupied by the visible (part of a) single rectangle. Now you can easily create a tree structure with links to the top rectangle (e.g. count the number of partitions in x and y direction, where there are more divide in the middle and create a node... proceed recursively). So the lookup will take O(log n^2) which is sub linear. And the data structure takes O(n^2) space.

Here is a better implementation in terms of complexity, because the search of the indices can be separated the search for the rectangle on top is only O(log n) no matter how the configuration of the rectangles is and fairly simple to implement:

private int[] x, y;
private Rectangle[][] r;

public RectangleFinder(Rectangle[] rectangles) {
    Set<Integer> xPartition = new HashSet<>(), yPartition = new HashSet<>();
    for (int i = 0; i < rectangles.length; i++) {
        xPartition.add(rectangles[i].getX());
        yPartition.add(rectangles[i].getY());
        xPartition.add(rectangles[i].getX() + rectangles[i].getWidth());
        yPartition.add(rectangles[i].getY() + rectangles[i].getHeight());
    }
    x = new int[xPartition.size()];
    y = new int[yPartition.size()];
    r = new Rectangle[x.length][y.length];
    int c = 0;
    for (Iterator<Integer> itr = xPartition.iterator(); itr.hasNext();)
        x[c++] = itr.next();
    c = 0;
    for (Iterator<Integer> itr = yPartition.iterator(); itr.hasNext();)
        y[c++] = itr.next();
    Arrays.sort(x);
    Arrays.sort(y);
    for (int i = 0; i < x.length; i++)
        for (int j = 0; j < y.length; j++)
            r[i][j] = rectangleOnTop(x[i], y[j]);
}

public Rectangle find(int x, int y) {
    return r[getIndex(x, this.x, 0, this.x.length)][getIndex(y, this.y, 0, this.y.length)];
}

private int getIndex(int n, int[] arr, int start, int len) {
    if (len <= 1)
        return start;
    int mid = start + len / 2;
    if (n < arr[mid])
        return getIndex(n, arr, start, len / 2);
    else
        return getIndex(n, arr, mid, len - len / 2);
}
like image 24
maraca Avatar answered Mar 12 '23 01:03

maraca