Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is a better algorithm than brute force to separate items in overlapping categories?

I have an arbitrary set of items (dots below), and some number of categories overlapping in an arbitrary way (A-C below). The test is to determine whether it's possible for each item to be assigned to a single category, among the ones it already belongs to, such that each category ends up with at least some number of items.

So for example, we may require that A, B, and C can each claim one item. If we're given all 4 dots below, showing that this is possible is easy: just stick all the items in a list, loop through the categories, and have each category remove an item that it has access too, and as long as each category is able to remove one item, we pass the test.

Venn diagram, with three circles and colored dots in the overlapping sections

Now suppose we remove the blue dot and repeat the test. Clearly we can assign yellow to A, red to B, and green to C, and again we pass. But it's hard to codify this solution: if we follow the previous method (again no blue dot), then suppose A starts by removing the green dot. Now if B removes the yellow dot, we'll fail the test (since C can't remove the red dot), whereas if B removes the red dot, C can still take yellow and pass.

One could solve this by brute force by iterating through every possible assignment of items to categories, checking to see if the condition is met with each iteration, but this won't scale well to arbitrary numbers of items and categories, and I have a feeling there's a smarter or more efficient way.

I don't know the name of this problem, which makes it hard to research. Does it have an optimal solution? If so, what kind of complexity can I expect for the solution?

like image 772
Shep Avatar asked Aug 23 '13 20:08

Shep


1 Answers

You were right to point out that it's an assignment problem, and as such, it can be solved using Graph Theory classic algorithms.

You can translate your problem as follows:

enter image description here

The nodes on one side represent the categories and the nodes on the other side represent the items. You want to find a maximal match. This problem can be reduced to finding maximal flow in a bi-partite graph.

Fold-Fulkerson can be used to find a maximum matching in a bipartite graph in O(ES) where E is the number of edges and S is the maximal flow in the network.

like image 163
Nir Alfasi Avatar answered Oct 11 '22 12:10

Nir Alfasi