Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

The intersection of all combinations of n sets

I need help finding an efficient algorithm to solve this problem:

Given n unsorted sets of integers, find all possible combinations of n and their intersections. For example:

Input (n=3):

Set 1 = 1, 10, 6, 11, 14, 3
Set 2 = 3, 7, 11, 9, 5
Set 3 = 11, 6, 9, 1, 4

Output:

Set 1 & 2: 3, 11
Set 1 & 3: 1, 6, 11
Set 2 & 3: 9, 11
Set 1, 2, & 3: 11

I was thinking of first finding all possible combinations of sets and then using an algorithm to find the intersection of n sets found here: Intersection of n sets. I'm worried about the time efficiency of this approach, though.

If you can find something better than my naive approach, an answer in pseudo-code would be most helpful.

like image 631
Dan Chrostowski Avatar asked Sep 24 '14 23:09

Dan Chrostowski


1 Answers

Here is a solution inspired by MapReduce: Simplified Data Processing on Large Clusters, which can therefore be written in a distributed way if you want.

Map all of your elements in sets to pairs [element, set]. Group this list by element. You can just sort and pull out elements. Or you can create a hash of arrays whose keys are the elements and values are the list of sets that element is found in. Then for each element, emit a list of [combination of sets, element]. Group that by combination. And now for each non-empty combination, you can just read off the elements in it.

If you wish to distribute the computation with a real map-reduce, the first map would map to a key of element, and value of set. The first reduce would just store by element the list of sets it is in. The second map would emit for each element one key for each combination of sets it is in, with the element as the value. And the second reduce would store your final answer.

It may help to work your example in detail. You start with:

Set 1 = 1, 10, 6, 11, 14, 3
Set 2 = 3, 7, 11, 9, 5
Set 3 = 11, 6, 9, 1, 4

You map that to the list:

[1, Set 1], [10, Set 1], [6, Set 1], [11, Set 1], [14, Set 1], [3, Set 1],
[3, Set 2], [7, Set 2], [11, Set 2], [9, Set 2], [5, Set 2],
[11, Set 3], [6, Set 3], [9, Set 3], [1, Set 3], [4, Set 3],

Now group by element (I did it by hand by sorting) to get:

1: Set 1, Set 3
3: Set 1, Set 2
4: Set 3
5: Set 2
6: Set 1, Set 3
7: Set 2
9: Set 2, Set 3
10: Set 1
11: Set 1, Set 2, Set 3
14: Set 1

And now we do the second mapping (skipping the elements that are only in one set) to get:

[(Set 1, Set 3), 1],
[(Set 1, Set 2), 3],
[(Set 1, Set 3), 6],
[(Set 2, Set 3), 9],
[(Set 1, Set 2), 11],
[(Set 1, Set 3), 11],
[(Set 2, Set 3), 11],
[(Set 1, Set 2, Set 3), 11]

Group that by combination of sets and we get:

(Set 1, Set 2): 3, 11
(Set 1, Set 3): 1, 6, 11
(Set 2, Set 3): 9, 11
(Set 1, Set 2, Set 3): 11
like image 117
btilly Avatar answered Sep 22 '22 23:09

btilly