Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to implement a set?

I want to implement a Set in C. Is it OK to use a linked list, when creating the SET, or should I use another approach ?

How do you usually implement your own set (if needed).

NOTE: If I use the Linked List approach, I will probably have the following complexities for Set my operations:

  • init : O(1);
  • destroy: O(n);
  • insert: O(n);
  • remove: O(n);
  • union: O(n*m);
  • intersection: O(n*m);
  • difference: O(n*m);
  • ismember: O(n);
  • issubset: O(n*m);
  • setisequal: O(n*m);

O(n*m) seems may be a little to big especially for huge data... Is there a way to implement my Set more efficient ?

like image 805
Andrei Ciobanu Avatar asked Mar 29 '10 12:03

Andrei Ciobanu


People also ask

How are sets implemented?

Sets can be implemented using various data structures, which provide different time and space trade-offs for various operations. Some implementations are designed to improve the efficiency of very specialized operations, such as nearest or union .

How are sets implemented in Java?

The Java platform contains three general-purpose Set implementations: HashSet , TreeSet , and LinkedHashSet . HashSet , which stores its elements in a hash table, is the best-performing implementation; however it makes no guarantees concerning the order of iteration.

How is a set implemented in C++?

Sets are implemented using a binary search tree. Sets are traversed using iterators.

How is set implemented internally in Python?

Creating Set:- In Python, Sets are created through set() function. An Empty list is created. Note that empty Set cannot be created through {}, it creates dictionary. Checking if an item is in : Time complexity of this operation is O(1) on average.


2 Answers

Sets are typically implemented either as red-black trees (which requires the elements to have a total order), or as an automatically-resizing hashtable (which requires a hash function).

The latter is typically implemented by having the hashtable double in size and reinserting all elements when a certain capacity threshold (75% works well) is exceeded. This means that inidividual insert operations can be O(n), but when amortized over many operations, it's actually O(1).

like image 119
Michael Borgwardt Avatar answered Sep 20 '22 01:09

Michael Borgwardt


std::set is often implemented as a red black tree: http://en.wikipedia.org/wiki/Red-black_tree

This approach will give you much better complexity on all the listed operations.

like image 28
Andreas Brinck Avatar answered Sep 22 '22 01:09

Andreas Brinck