Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Merging overlapping axis-aligned rectangles

I have a set of axis aligned rectangles. When two rectangles overlap (partly or completely), they shall be merged into their common bounding box. This process works recursively.

Detecting all overlaps and using union-find to form groups, which you merge in the end will not work, because the merging of two rectangles covers a larger area and can create new overlaps. (In the figure below, after the two overlapping rectangles have been merged, a new overlap appears.)

enter image description here

As in my case the number of rectangles is moderate (say N<100), a brute force solution is usable (try all pairs and if an overlap is found, merge and restart from the beginning). Anyway I would like to reduce the complexity, which is probably O(N³) in the worst case.

Any suggestion how to improve this ?

like image 367
Yves Daoust Avatar asked Feb 01 '18 07:02

Yves Daoust


1 Answers

I think an R-Tree will do the job here. An R-Tree indexes rectangular regions and lets you insert, delete and query (e.g, intersect queries) in O(log n) for "normal" queries in low dimensions.

The idea is to process your rectangles successively, for each rectangle you do the following:

  1. perform an intersect query on the current R-Tree (empty in the beginning)

  2. If there are results then delete the results from the R-Tree, merge the current rectangle with all result rectangles and insert the newly merged rectangle (for the last step jump to step 1.).

  3. If there are no results just insert the rectangle in the R-Tree

In total you will perform

  • O(n) intersect queries in step 1. (O(n log n))
  • O(n) insert steps in step 3. (O(n log n))
  • at most n delete and n insert steps in step 2. This is because each time you perform step 2 you decrease the number of rectangles by at least 1 (O(n log n))

In theory you should get away with O(n log n), however the merging steps in the end (with large rectangles) might have low selectivity and need more than O(log n), but depending on the data distribution this should not ruin the overall runtime.

like image 95
SaiBot Avatar answered Oct 17 '22 22:10

SaiBot