I have a collection of int values with which I populate a HashSet<int> in the following manner:
var hashSet = new HashSet<int>(myIEnumerable);
Assuming that iterating the IEnumerable is O(n), what will be the worst-case complexity of creating a HashSet<int> in this way?
In Java, the add(), remove() and contains() operations of HashSet and LinkedHashSet cost expected O(1) time thanks to the internal HashMap implementation; EnumSet also offers constant-time operations, but it is backed by a bit vector rather than a HashMap. TreeSet has O(log(n)) time complexity for the same operations because it is built on a TreeMap.
On average, the contains() of Java's HashSet runs in O(1) time. Getting the object's bucket location is a constant-time operation. Taking into account possible collisions, the lookup time may rise to O(log(n)), because since Java 8 a heavily collided bucket is converted into a balanced tree.
In .NET, a HashSet is an optimized collection of unordered, unique elements that provides fast lookups and high-performance set operations. The HashSet<T> class was first introduced in .NET Framework 3.5 and is part of the System.Collections.Generic namespace.
You can bring the worst case to O(N^2) by supplying objects that all hash to the same bucket when the set reaches its maximum size. For example, if you pass a sequence of 17519 ints constructed as x[i] = i * 17519 for i between 1 and 17519, inclusive, all numbers will hash to the initial bucket on Microsoft's implementation of HashSet<int>, taking O(N^2) to insert:
var h = new HashSet<int>(Enumerable.Range(1, 17519).Select(i => i*17519));
Set a breakpoint, and examine h in the debugger. Look at Raw View / Non-public members / m_buckets. Observe that the initial bucket has 17519 elements, while the remaining 17518 all have zeros.
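To see where the quadratic bound comes from, here is a rough count under a simplifying assumption that the answer above does not state: every one of the N elements lands in the same bucket from the first insertion on, and each insertion compares the new value against every entry already chained in that bucket before adding it. The k-th insertion then costs about k - 1 comparisons:

\[
\sum_{k=1}^{N} (k-1) \;=\; \frac{N(N-1)}{2} \;=\; O(N^2),
\qquad
N = 17519 \;\Rightarrow\; \frac{17519 \cdot 17518}{2} = 153{,}448{,}921 .
\]

This is only a sketch of the idealized case. A real run may resize the set while it is being filled, so the exact number of comparisons can differ from the figure above.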
The documentation actually states:
This constructor is an O(n) operation, where n is the number of elements in the collection parameter.
http://msdn.microsoft.com/en-us/library/bb301504.aspx