This is an interview question; the interview is already over.
Given a deck of rectangular cards, place them randomly on a rectangular table whose area is much larger than the total area of the cards. Some cards may overlap each other. Design an algorithm that calculates the area of the table covered by the cards, and analyze its time complexity. The coordinates of every vertex of every card are known, and the cards can overlap in any pattern.
My idea:
Sort the cards by vertical coordinate in descending order.
Scan vertically from top to bottom. On reaching an edge or vertex of a card, continue with a second scan line until it reaches the next edge or vertex, and compute the covered area between the two lines. Finally, sum the areas of all such strips to get the result.
However, computing the area between two lines is a problem when the covered region is irregular.
Any help is appreciated. Thanks!
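One possible way to handle the irregular area between two scan lines, sketched here under assumptions that are not part of the original post: each card is a convex polygon given as a list of (x, y) vertices, and scan lines are placed at every vertex and at every point where two card edges cross. Between two consecutive scan lines no edges start, end, or cross, so the covered width changes linearly with height and the strip area equals its height times the covered width at the strip's midline. The function names are hypothetical, and the sketch is unoptimized (roughly O(n^3 log n) in the worst case).

```python
from itertools import combinations

def crossing_y(p, q, r, s):
    """y-coordinate where segments pq and rs cross, or None if they do not."""
    ax, ay = q[0] - p[0], q[1] - p[1]
    bx, by = s[0] - r[0], s[1] - r[1]
    cx, cy = r[0] - p[0], r[1] - p[1]
    d = ax * by - ay * bx
    if d == 0:  # parallel or collinear: no single crossing point
        return None
    t = (cx * by - cy * bx) / d
    u = (cx * ay - cy * ax) / d
    return p[1] + t * ay if 0 <= t <= 1 and 0 <= u <= 1 else None

def cross_section(poly, y):
    """x-interval where the horizontal line at height y cuts a convex polygon."""
    xs = []
    for i in range(len(poly)):
        (x0, y0), (x1, y1) = poly[i], poly[(i + 1) % len(poly)]
        if (y0 < y) != (y1 < y):  # edge straddles the line, so it is not horizontal
            xs.append(x0 + (y - y0) * (x1 - x0) / (y1 - y0))
    return (min(xs), max(xs)) if xs else None

def covered_length(intervals):
    """Total length of the union of 1D intervals."""
    total, right = 0.0, float("-inf")
    for lo, hi in sorted(intervals):
        if hi > right:
            total += hi - max(lo, right)
            right = hi
    return total

def union_area(cards):
    """Area covered by a list of convex polygons, summed strip by strip."""
    edges = [(c[i], c[(i + 1) % len(c)]) for c in cards for i in range(len(c))]
    ys = {y for c in cards for _, y in c}          # scan lines at vertices
    for (p, q), (r, s) in combinations(edges, 2):  # and at edge crossings
        y = crossing_y(p, q, r, s)
        if y is not None:
            ys.add(y)
    ys = sorted(ys)
    area = 0.0
    for y0, y1 in zip(ys, ys[1:]):
        mid = (y0 + y1) / 2
        sections = [cross_section(c, mid) for c in cards]
        area += (y1 - y0) * covered_length([iv for iv in sections if iv])
    return area
```

For example, `union_area([[(0, 0), (1, 0), (1, 1), (0, 1)], [(1, 0), (2, 1), (1, 2), (0, 1)]])` combines a unit square with a rotated square that overlaps one of its corners.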
This could be done easily using the inclusion-exclusion formula (size of A union B union C = A + B + C - AB - AC - BC + ABC, etc.), but that would result in an exponential-time algorithm, since the formula has 2^n - 1 terms. There is another, more complicated way that results in O(n^2 (log n)^2).
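To make the inclusion-exclusion idea concrete, here is a minimal sketch that assumes axis-aligned cards, each given as a hypothetical (x0, y0, x1, y1) tuple, so that the intersection of any group of cards is again a rectangle. Rotated cards would need a polygon-clipping step instead. The loop visits every non-empty subset of cards, which is why this approach only works for very small n.

```python
from itertools import combinations

def intersection_area(rects):
    """Area common to all axis-aligned rectangles (x0, y0, x1, y1) in rects."""
    x0 = max(r[0] for r in rects)
    y0 = max(r[1] for r in rects)
    x1 = min(r[2] for r in rects)
    y1 = min(r[3] for r in rects)
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)

def union_area_inclusion_exclusion(rects):
    """Union area via inclusion-exclusion over all non-empty subsets."""
    total = 0.0
    for k in range(1, len(rects) + 1):
        sign = 1 if k % 2 == 1 else -1  # add odd-sized subsets, subtract even-sized ones
        for subset in combinations(rects, k):
            total += sign * intersection_area(subset)
    return total
```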
Store each card as a polygon + its area in a list. Compare each polygon in the list to every other polygon. If they intersect, remove them both from the list, and add their union to the list. Continue until no polygons intersect. Sum their areas to find the total area.
The polygons can be concave and have holes, so computing their intersection is not easy. However, there are algorithms (and libraries) available to compute it in O(k log k), where k is the number of vertices. Since the number of vertices can be on the order of n, this means computing the intersection is O(n log n).
Comparing every polygon to every other polygon is O(n^2). However, we can use an O(n log n) sweeping algorithm to find nearest polygons instead, making the overall algorithm O((n log n)^2) = O(n^2 (log n)^2).
This is almost certainly not what your interviewers were looking for, but I'd've proposed it just to see what they said in response:
I'm assuming that all cards are the same size and are strictly rectangular with no holes, but that they are placed randomly in an X,Y sense and also oriented randomly in a theta sense. Therefore, each card is characterized by a triple (x, y, theta); alternatively, you have the four corner locations of each card. With this information, we can do a Monte Carlo analysis fairly simply.
Simply generate a number of points at random on the surface of the table, and determine, by using the list, whether or not each point is covered by any card. If yes, keep it; if not, throw it out. Estimate the covered area as the table area multiplied by the ratio of kept points to total points.
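A minimal sketch of this estimate, assuming each card is a (cx, cy, theta) triple giving its center and rotation in radians, all cards share a width w and height h, and the table spans [0, table_w] x [0, table_h]. The function names and parameters are illustrative only.

```python
import math
import random

def point_in_card(px, py, card, w, h):
    """True if (px, py) lies inside a w-by-h card centered at (cx, cy), rotated by theta."""
    cx, cy, theta = card
    dx, dy = px - cx, py - cy
    # Express the point in the card's own axes by rotating it through -theta.
    u = dx * math.cos(theta) + dy * math.sin(theta)
    v = -dx * math.sin(theta) + dy * math.cos(theta)
    return abs(u) <= w / 2 and abs(v) <= h / 2

def monte_carlo_area(cards, w, h, table_w, table_h, samples=100_000, seed=0):
    """Estimate covered area as table area times the fraction of covered sample points."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        px = rng.uniform(0, table_w)
        py = rng.uniform(0, table_h)
        if any(point_in_card(px, py, c, w, h) for c in cards):  # O(n) test per point
            hits += 1
    return table_w * table_h * hits / samples
```

The statistical error shrinks roughly like one over the square root of the number of samples, so each extra digit of accuracy costs about 100 times more points.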
Obviously, you can test each point in O(n) where n is the number of cards. However, there is a slick little technique that I think applies here, and I think will speed things up. You can grid out your table top with an appropriate grid size (related to the size of the cards) and pre-process the cards to figure out which grids they could possibly be in. (You can over-estimate by pre-processing the cards as though they were circular disks with a diameter going between opposite corners.) Now build up a hash table with the keys as grid locations and the contents of each being any possible card that could possibly overlap that grid. (Cards will appear in multiple grids.)
Now every time you need to include or exclude a point, you don't need to check each card, but only the pre-processed cards that could possibly be in your point's grid location.
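The grid lookup could be sketched as follows, again assuming (cx, cy, theta) cards of common size w by h. Each card is registered in every cell touched by the bounding box of its circumscribed disk, which over-estimates slightly but never misses a card. The names `build_grid` and `is_covered`, and the choice of cell size, are assumptions made for illustration.

```python
import math
from collections import defaultdict

def point_in_card(px, py, card, w, h):
    """True if (px, py) lies inside a w-by-h card centered at (cx, cy), rotated by theta."""
    cx, cy, theta = card
    dx, dy = px - cx, py - cy
    u = dx * math.cos(theta) + dy * math.sin(theta)
    v = -dx * math.sin(theta) + dy * math.cos(theta)
    return abs(u) <= w / 2 and abs(v) <= h / 2

def build_grid(cards, w, h, cell):
    """Map each grid cell (gx, gy) to the indices of cards that might overlap it."""
    r = 0.5 * math.hypot(w, h)  # radius of the disk through opposite corners
    grid = defaultdict(list)
    for idx, (cx, cy, _theta) in enumerate(cards):
        for gx in range(math.floor((cx - r) / cell), math.floor((cx + r) / cell) + 1):
            for gy in range(math.floor((cy - r) / cell), math.floor((cy + r) / cell) + 1):
                grid[(gx, gy)].append(idx)
    return grid

def is_covered(px, py, cards, w, h, grid, cell):
    """Test a point against only the cards registered in its grid cell."""
    key = (math.floor(px / cell), math.floor(py / cell))
    return any(point_in_card(px, py, cards[i], w, h) for i in grid.get(key, ()))
```

With a cell size comparable to the card diagonal, each card lands in only a handful of cells, so a point test touches a few candidates rather than all n cards when the cards are spread out.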
There's a lot to be said for this method:
On the other hand:
I wish I could take credit for this idea, but alas, I picked it up from a paper calculating surface areas of proteins based on the position and sizes of the atoms in the proteins. (Same basic idea, except now we had a 3D grid in 3-space, and the cards really were disks. We'd go through and for each atom, generate a bunch of points on its surface and see if they were or were not interior to any other atoms.)
EDIT: It occurs to me that the original problem stipulates that the total table area is much larger than the total card area. In this case, an appropriate grid size means that a majority of the grids must be unoccupied. You can also pre-process grid locations, once your hash table is built up, and eliminate those entirely, only generating points inside possibly occupied grid locations. (Basically, perform individual MC estimates on each potentially occluded grid location.)
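The per-cell refinement from the edit might look like the sketch below. It assumes a `grid` dictionary mapping (gx, gy) cell indices to the list of candidate shapes for that cell, with cell (gx, gy) spanning [gx * cell, (gx + 1) * cell) in each direction, and a caller-supplied `covered(px, py, candidates)` predicate. Both are hypothetical interfaces chosen to keep the example short. Cells missing from the dictionary are never sampled and contribute zero area.

```python
import random

def estimate_area_occupied_cells(grid, cell, covered, samples_per_cell=2000, seed=0):
    """Sum independent Monte Carlo estimates over the occupied grid cells only."""
    rng = random.Random(seed)
    area = 0.0
    for (gx, gy), candidates in grid.items():
        hits = 0
        for _ in range(samples_per_cell):
            # Draw a point uniformly inside this cell.
            px = (gx + rng.random()) * cell
            py = (gy + rng.random()) * cell
            if covered(px, py, candidates):
                hits += 1
        area += cell * cell * hits / samples_per_cell
    return area
```

Because no samples are spent on empty parts of the table, the same number of points gives a tighter estimate than sampling the whole table uniformly.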
Here's an idea that is not perfect but is practically useful. You design an algorithm that depends on an accuracy measure epsilon (eps). Imagine you split the space into squares of size eps x eps. Now you want to count the number of squares lying inside the cards. Let the number of cards be n and let the sides of the cards be h and w.
Here is a naive way to do it:
    S = {} // Hashset of grid points, shared by all cards
    for every card:
        for x in [min x value of card, max x value of card] step eps, snapped to multiples of eps:
            for y in [min y value of card, max y value of card] step eps, snapped to multiples of eps:
                if (x, y) is in the card:
                    S.add((x, y))
    return size(S) * eps * eps
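A runnable take on the pseudocode above, assuming each card is a convex polygon given as a list of (x, y) vertices. It stores integer cell indices rather than raw coordinates so that overlapping cards hit the same set entries, and it tests the center of each eps-by-eps cell. The helper names are hypothetical.

```python
import math

def point_in_convex(px, py, poly):
    """True if (px, py) is inside or on the boundary of a convex polygon."""
    sign = 0
    for i in range(len(poly)):
        (x0, y0), (x1, y1) = poly[i], poly[(i + 1) % len(poly)]
        cross = (x1 - x0) * (py - y0) - (y1 - y0) * (px - x0)
        if cross != 0:
            if sign == 0:
                sign = 1 if cross > 0 else -1
            elif (cross > 0) != (sign > 0):  # point is on different sides of two edges
                return False
    return True

def grid_area(cards, eps):
    """Approximate the covered area by counting eps-by-eps cells whose center is covered."""
    cells = set()
    for poly in cards:
        xs = [x for x, _ in poly]
        ys = [y for _, y in poly]
        # Only scan cells inside this card's bounding box.
        for i in range(math.floor(min(xs) / eps), math.ceil(max(xs) / eps) + 1):
            for j in range(math.floor(min(ys) / eps), math.ceil(max(ys) / eps) + 1):
                if point_in_convex((i + 0.5) * eps, (j + 0.5) * eps, poly):
                    cells.add((i, j))  # the set removes double counting of overlaps
    return len(cells) * eps * eps
```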
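The algorithm runs in O(n * (S/eps)^2) and the error is bounded by (2 * S * n * eps), therefore the relative error is at most (2 * eps * n / S). (Here S stands for the side length of a card, i.e. h and w are both about S, rather than the hash set above.)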
So for example, to guarantee an error of less than 1%, you have to choose eps less than S / (200 n) and the algorithm runs in about 200^2 * n^3 steps.
Suppose there are n cards of unit area. Let T be the area of the table. For the discretised problem, the expected area covered will be
$ T\left(1-\left(\frac{T-1}{T}\right)^n\right) $
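One way to see where this comes from, assuming "discretised" means the table is split into T unit cells and each card independently lands on exactly one cell chosen uniformly at random (an interpretation, not something stated above): a given cell is missed by one card with probability
$ \frac{T-1}{T} $
and by all n cards with probability
$ \left(\frac{T-1}{T}\right)^n $
so by linearity of expectation the expected number of covered cells is
$ \sum_{c=1}^{T} \left(1-\left(\frac{T-1}{T}\right)^n\right) = T\left(1-\left(\frac{T-1}{T}\right)^n\right) $
When T is much larger than n, expanding to second order gives
$ T\left(1-\left(1-\frac{1}{T}\right)^n\right) \approx n - \frac{n(n-1)}{2T} $
that is, the total card area n minus a small correction for expected overlaps.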