Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best way to find a subset of a set from a group of sets

First, sorry for the ambiguous title.

Assume I have the following group of sets:

Group 1

s1 = ( x1, y1 )
s2 = ( x2 )

Group 2

m1 = ( x1, y1, y2 )
m2 = ( x1 )
m3 = ( x1 , x2 )

For each of the sets in Group 1 - call the set s, I need to find the sets in Group 2 - call it m - such that m is a subset of s.

So, for my example, the answer would be:

s1 -> m2
s2 -> nothing

For now, I'm storing the values in std:set, but I can change that if needed. Also, the sets can get big, so the algorithm needs to be efficient. For now I have a brute-force approach, which I'm not entirely satisfied with.

Any suggestions?

like image 732
Luchian Grigore Avatar asked Feb 15 '12 17:02

Luchian Grigore


People also ask

How do you find the subsets of a set?

If a set contains n elements, then the number of subsets of this set is equal to 2ⁿ - 1 . The only subset which is not proper is the set itself. So, to get the number of proper subsets, you just need to subtract one from the total number of subsets.

How do you find the subsets of a set with 6 elements?

Since the number of subsets of a set are 2n if the set contains n elements, ∴ Number of subsets =26=64. Was this answer helpful?

How many subsets are there in a set a 1 2 3 4 5?

The given set {1, 2, 3, 4, 5} contains 5 elements. So, it has 25 = 32 subsets in all and 31 proper subsets.

How many subsets of one or more elements can be formed from a set containing 12 elements?

The set s has 12 elements, we need to find out the subsets of s have at most 4 elements. Therefore, the number of subsets 210+120+45+10+1=386.


1 Answers

The first step would be to sort Group 1 according to cardinality (i.e. size). Then the algorithm is something on the order of:

foreach std::set M in "Group 2" {
  foreach std::set S in "Group 1" and S.size()>=M.size() {  // replace with binary search
     if ( std::includes(S.begin(),S.end(),M.begin(),M.end()) )
       { /* M is a subset of S */ }
    }
  }
}

This should have time complexity ~O(MSR), where M is the # of sets in "Group 2", S the # of sets in "Group 1", and R is the size of largest set in "Group #1".

Edit: It just occurred to me that it might be more efficient to use S.find() rather than calling std::includes() (which iterates sequentially) but I think that would only be true if M.size() is much smaller than S.size() -- O(M+S) vs O(MlogS).

like image 72
mcmcc Avatar answered Oct 21 '22 06:10

mcmcc