Collection of sets containing no sets which are a subset of another in the collection

I am looking for an abstract data structure which represents a collection of sets such that no set in the collection is a subset of another set in the collection.

This means that on insert the following conditions will be met:

A. Inserting an element that is already a subset of another element will return the original collection.

B. Inserting an element that is a superset of one or more existing elements will result in a collection with the superset added and those subsets removed.

Assuming an ordering on the elements of the sets, a prefix tree can be used to represent the collection. This permits condition A to be handled very quickly (i.e., checking the condition takes no longer than inserting the subset would). Meeting condition B, however, takes time.
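To make the prefix-tree idea concrete, here is a minimal sketch of one way the condition A check could look. The names `TrieNode`, `trie_insert`, and `has_superset` are hypothetical, the elements are assumed to be mutually comparable, and nothing is ever deleted from the tree. Note that this naive search may follow more than one branch, so it is only an illustration and not necessarily the exact scheme the question has in mind.

```python
class TrieNode:
    """One node of a prefix tree over sorted set elements."""

    def __init__(self):
        self.children = {}     # element -> TrieNode
        self.terminal = False  # True if a stored set ends at this node


def trie_insert(root, s):
    """Store set s as the path of its elements in sorted order."""
    node = root
    for x in sorted(s):
        node = node.children.setdefault(x, TrieNode())
    node.terminal = True


def has_superset(node, items, i=0):
    """Return True if some stored set below node contains all of items[i:].

    items must be a sorted list. Assumes no deletions, so every non-root
    node lies on the path of at least one stored set.
    """
    if i == len(items):
        return node.terminal or bool(node.children)
    for label, child in node.children.items():
        # Consume the next required element when the edge label matches it.
        if label == items[i] and has_superset(child, items, i + 1):
            return True
        # Skip over a smaller element that the stored set has in addition.
        if label < items[i] and has_superset(child, items, i):
            return True
    return False
```

For example, after storing {1, 2, 3} and {2, 5}, calling `has_superset(root, [1, 3])` follows the path 1, 2, 3 and succeeds, while `has_superset(root, [3, 5])` finds no path containing both elements.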

I am wondering if there is a data structure that allows B to be met quickly as well.

455

asked Nov 15 '09 09:11

Mark Wassell

1 Answer

The trivial approach would be to keep a list of sets and perform a linear search through it for every incoming set (testing whether the incoming set is a subset of each stored set).

This takes O(n) subset tests for the linear search, and each test costs up to O(m) for an incoming set of size m (assuming hashed membership checks). Thus O(n*m) total time, where n is the number of stored sets and m is the size of the incoming set.
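As a baseline, the linear approach might be sketched as follows. The function name `insert_naive` is made up for illustration, and the collection is assumed to be a plain list of `frozenset` objects.

```python
def insert_naive(collection, new_set):
    """Return the collection (a list of frozensets) after inserting new_set."""
    new_set = frozenset(new_set)
    # Condition A: new_set is already covered by an existing set.
    if any(new_set <= existing for existing in collection):
        return collection
    # Condition B: drop every existing set that new_set covers, then add it.
    return [s for s in collection if not s <= new_set] + [new_set]
```

Inserting {1, 2}, then {1}, then {1, 2, 3} leaves only {1, 2, 3} in the list: the second insert is rejected by condition A and the third evicts {1, 2} under condition B.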

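The most obvious optimization, of course, is to index on set sizes. Then you only test each incoming set against those of equal or larger size. (A set cannot be a subset of any smaller set, duh!) The same index helps with condition B in reverse: only stored sets that are no larger than the incoming set can be subsets of it.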
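A rough sketch of the size index, assuming the sets are hashable once converted to `frozenset`. The class name `SizeBucketedCollection` and its methods are hypothetical.

```python
from collections import defaultdict


class SizeBucketedCollection:
    """Stores sets in buckets keyed by their size."""

    def __init__(self):
        self.by_size = defaultdict(set)  # size -> frozensets of that size

    def insert(self, new_set):
        """Insert new_set; return False if condition A rejects it."""
        new_set = frozenset(new_set)
        k = len(new_set)
        # Condition A: only sets with at least k elements can contain new_set.
        for size, bucket in self.by_size.items():
            if size >= k and any(new_set <= s for s in bucket):
                return False
        # Condition B: only strictly smaller sets can still be subsets here,
        # because an equal-sized subset would be identical and caught above.
        for size, bucket in self.by_size.items():
            if size < k:
                bucket.difference_update([s for s in bucket if s <= new_set])
        self.by_size[k].add(new_set)
        return True

    def sets(self):
        """Return all stored sets as one set of frozensets."""
        return {s for bucket in self.by_size.values() for s in bucket}
```

This only prunes which buckets are scanned. Within a bucket the search is still linear, so the worst case is unchanged when most sets have similar sizes.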

The next optimization that comes to mind is to create an index of elements. For each incoming set you'd then intersect the collections of stored sets that contain each of its elements. In other words if, for incoming set {a,b,c}, we find that element a exists in sets A, B, and D, element b exists in B, E, and F, and element c exists in A, B, and Z ... then the incoming set is a subset of B (the intersection of {A, B, D}, {B, E, F}, and {A, B, Z}).

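So that sounds like O(m) index lookups to me if the index is hashed (one lookup per element of the incoming set), or O(m*log(n)) with a tree-based map. On top of that comes the cost of intersecting the resulting ID sets, which depends on how many stored sets share each element and can approach O(n*m) in the worst case. Insertions should be on the same order as the lookups (inserting the new set's ID into each element's entry). (In Big-O analysis a constant factor such as 2*O(m*log(n)) reduces to O(m*log(n)), of course.)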
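Here is a minimal sketch of the element index, under a few assumptions: elements are hashable, sets are non-empty, and the class name `ElementIndexedCollection` and its methods are made up for illustration. Condition A uses the intersection described above. For condition B the sketch adds a counting pass that the answer does not describe: a stored set is a subset of the incoming set exactly when every one of its elements is hit.

```python
from collections import defaultdict


class ElementIndexedCollection:
    """Keeps an inverted index from each element to the sets containing it."""

    def __init__(self):
        self.sets = {}                 # set ID -> frozenset
        self.index = defaultdict(set)  # element -> IDs of sets containing it
        self.next_id = 0

    def _supersets_of(self, query):
        """IDs of stored sets that contain every element of query."""
        postings = sorted((self.index.get(x, set()) for x in query), key=len)
        return set.intersection(*postings)

    def _subsets_of(self, query):
        """IDs of stored sets whose elements all appear in query."""
        hits = defaultdict(int)
        for x in query:
            for sid in self.index.get(x, ()):
                hits[sid] += 1
        return [sid for sid, n in hits.items() if n == len(self.sets[sid])]

    def insert(self, new_set):
        """Insert new_set; return False if condition A rejects it."""
        new_set = frozenset(new_set)
        if not new_set:
            raise ValueError("this sketch assumes non-empty sets")
        if self._supersets_of(new_set):        # condition A
            return False
        for sid in self._subsets_of(new_set):  # condition B
            for x in self.sets.pop(sid):
                self.index[x].discard(sid)
        sid = self.next_id
        self.next_id += 1
        self.sets[sid] = new_set
        for x in new_set:
            self.index[x].add(sid)
        return True
```

The counting pass touches only stored sets that share at least one element with the incoming set, so its cost is the total length of the index entries visited rather than the size of the whole collection.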

114

answered Oct 20 '22 21:10

Jim Dennis

Tags: data-structures, set, subset
