[Please let me know if this maps to a known problem]
I have n sets of varying sizes. Each element in a set is unique. And each element can occur atmost in two different sets.
I want to perform an operation on these sets but avoid duplicates or missing any element. Problem: Find out which all of these n sets should be removed because they are covered by other sets.
E.g. [a,b,c]; [a]; [b]. Remove [a], [b] since both are covered by the first one.
E.g. [a,b,c]; [a]; [b]; [c,d]. Remove [a,b,c] since all three elements are covered by remaining sets. Note: here [a],[b] alone is not valid answer since 'c' is being duplicated. Similarly [a],[b],[c,d] is not valid answer since 'd' will be missed if removed.
I think that this is the Exact Cover problem. The last constraint—that each element is in at most two sets—doesn't seem to me to fundamentally change the problem (although I could easily be wrong about this). The Wikipedia web page contains a good summary of various algorithmic approaches. The algorithm of choice seems to be Dancing Links.
I think this is a case of a 2-sat problem that can be solved in linear time using a method based on Tarjan's algorithm.
You can now solve this using a standard 2-sat algorithm to find whether this is impossible to achieve or a satisfying assignment if it is possible.
For a case with V sets and N elements you will have V variables and up to 2N clauses, so Tarjan's algorithm will have complexity O(V+2N).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With