I'm trying to understand the algorithm that can be used to calculate the area of the union of a set of axis aligned rectangles.
The solution that I'm following is here : http://tryalgo.org/en/geometry/2016/06/25/union-of-rectangles/
The part I don't understand is :
The segment tree is the right choice for this data structure. It has complexity O(logn) for the update operations and O(1) for the query. We need to augment the segment tree with a score per node, with the following properties.
- every node corresponds to a y-interval being the union of the elementary y-intervals over all the indices in the span of the node.
- if the node value is zero, the score is the sum of the scores of the descendants (or 0 if the node is a leaf).
- if the node value is positive, the score is the length of the y-interval corresponding to the node.
How do we achieve this in O(n log n) ?
My idea was to create a segment tree and, while line sweeping, update the value of each y-range (the vertical extent of a rectangle) as we encounter it. Then, for each interval between two consecutive elements of the sorted x array, multiply Δx by the total length of the y-range active in that interval, obtained from the sum of all elements in the segment tree.
This would still lead to max(y) - min(y) elements in the segment tree's base.
Hence, I'm not sure how this is O(n log n) - where n is the number of rectangles.
Would greatly appreciate any help here.
Thanks!
Let's consider an easy case:
According to your understanding, you would create a segment tree with 11 - 1 = 10 nodes at its base, so something like this:
Notice we have only 9 nodes in the base, because the first node is for the interval [1,2], the next one for the interval [2,3], and so on.
When you enter a rectangle, you would update its range based on its y coordinates, so after meeting the first one at x=0, your segment tree would look like this:
We would also need to use lazy propagation to update the active intervals in the tree, so that every active unit interval contributes 1 to the sum.
So the complexity of your current approach is something like O(K log K), where K = max(y) - min(y).
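To make the dependence on K concrete, here is a rough Python sketch of the uncompressed idea. Note that it swaps the segment tree for a plain counter array with one slot per unit y-interval, so each event costs O(K) rather than O(log K); it only illustrates why the base has K entries. The function name `union_area_unit_cells` and the two rectangles are hypothetical, and integer coordinates are assumed.

```python
def union_area_unit_cells(rects):
    """Union area of integer-coordinate rectangles given as (x1, y1, x2, y2)."""
    if not rects:
        return 0
    y_min = min(r[1] for r in rects)
    y_max = max(r[3] for r in rects)
    # count[i] = number of active rectangles covering [y_min + i, y_min + i + 1]
    count = [0] * (y_max - y_min)

    events = []
    for x1, y1, x2, y2 in rects:
        events.append((x1, 1, y1, y2))    # rectangle starts
        events.append((x2, -1, y1, y2))   # rectangle ends
    events.sort()

    area = 0
    prev_x = events[0][0]
    for x, delta, y1, y2 in events:
        covered = sum(1 for c in count if c > 0)  # active length, in unit cells
        area += (x - prev_x) * covered
        for y in range(y1, y2):
            count[y - y_min] += delta
        prev_x = x
    return area


# Hypothetical rectangles whose y-coordinates are 1, 3, 6 and 11.
rects = [(0, 1, 4, 6), (2, 3, 7, 11)]
print(union_area_unit_cells(rects))
```

The `count` array has max(y) - min(y) entries regardless of how many rectangles there are, which is exactly the cost the question is worried about.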
We can easily reduce this to O(n log n), where n is the number of rectangles.
Notice that the only important y coordinates are those that actually occur, so in this example 1, 3, 6, 11.
Also notice that there are at most 2*n such coordinates.
So we can map all coordinates to small integers so that they fit better in the segment tree.
This is known as coordinate compression; it can be done with something like this:
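A minimal Python sketch of coordinate compression: collect the y-coordinates, sort them, drop duplicates, and map each value to its 1-based rank. The helper name `compress` and the two rectangles are hypothetical; only the y-values 1, 3, 6, 11 and the name `mp` come from the example.

```python
def compress(ys):
    """Return the sorted distinct values and a map from value to 1-based rank."""
    sorted_unique = sorted(set(ys))
    mp = {y: i + 1 for i, y in enumerate(sorted_unique)}
    return sorted_unique, mp


# Hypothetical rectangles (x1, y1, x2, y2) whose y-coordinates are 1, 3, 6 and 11.
rects = [(0, 1, 4, 6), (2, 3, 7, 11)]
ys = [y for (_, y1, _, y2) in rects for y in (y1, y2)]
sorted_unique, mp = compress(ys)
print(sorted_unique)
print(mp)
```

Sorting dominates, so this step is O(n log n) for n rectangles.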
So in our example it would be:
[1,3,6,11]
[1,3,6,11]
mp[1]=1, mp[3]=2, mp[6]=3, mp[11]=4
So now the algorithm stays the same, yet we can use a segment tree with at most 2*n nodes in its base.
We would also need to modify our segment tree a little: instead of tracking which y coordinates are on or off, we now track which intervals of y coordinates are on or off.
So we will have nodes for the intervals [y0,y1], [y1,y2], ... over all unique sorted values of y.
Also, each node will contribute y[i]-y[i-1] to the sum (if it is in range and active) instead of 1.
So our new segment tree would be something like this:
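As a sketch only, here is one way the whole sweep might look in Python with such a tree. It follows the "value plus score" description quoted in the question: `cover` is the node value (how many active rectangles span the whole node) and `score` is the covered length. It is an assumption of this sketch that no lazy push-down is needed, which holds because every removal exactly undoes an earlier insertion of the same range. The names `union_area`, `cover`, `score` and the test rectangles are hypothetical.

```python
def union_area(rects):
    """Area of the union of axis-aligned rectangles given as (x1, y1, x2, y2)."""
    ys = sorted({y for _, y1, _, y2 in rects for y in (y1, y2)})
    index = {y: i for i, y in enumerate(ys)}   # compressed (0-based) coordinate
    m = len(ys) - 1                            # number of elementary y-intervals
    if m <= 0:
        return 0
    cover = [0] * (4 * m)   # node value: active rectangles spanning the whole node
    score = [0] * (4 * m)   # covered length within the node's y-range

    def update(node, lo, hi, a, b, delta):
        # The node spans elementary intervals lo..hi-1, i.e. y-range [ys[lo], ys[hi]].
        if b <= lo or hi <= a:
            return
        if a <= lo and hi <= b:
            cover[node] += delta
        else:
            mid = (lo + hi) // 2
            update(2 * node, lo, mid, a, b, delta)
            update(2 * node + 1, mid, hi, a, b, delta)
        if cover[node] > 0:
            score[node] = ys[hi] - ys[lo]        # whole node is covered
        elif hi - lo == 1:
            score[node] = 0                      # uncovered leaf
        else:
            score[node] = score[2 * node] + score[2 * node + 1]

    events = []
    for x1, y1, x2, y2 in rects:
        if x1 < x2 and y1 < y2:                  # ignore empty rectangles
            events.append((x1, 1, index[y1], index[y2]))
            events.append((x2, -1, index[y1], index[y2]))
    events.sort()

    area = 0
    prev_x = None
    for x, delta, a, b in events:
        if prev_x is not None:
            area += (x - prev_x) * score[1]      # O(1) query at the root
        update(1, 0, m, a, b, delta)             # O(log n) update
        prev_x = x
    return area


# Hypothetical rectangles whose y-coordinates are 1, 3, 6 and 11.
print(union_area([(0, 1, 4, 6), (2, 3, 7, 11)]))
```

There are 2*n events and each triggers one O(log n) update plus an O(1) read of the root score, so together with the sorting the total is O(n log n).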