Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Time complexity of unordered_set<int> find method

What is the time complexity of find method in unordered_set<int>?

And also is it possible to change the hash functions ?

like image 987
navid mahdian Avatar asked Feb 27 '17 14:02

navid mahdian


People also ask

What does unordered_set find return?

The unordered_set::find() function is a built-in function in C++ STL which is used to search for an element in the container. It returns an iterator to the element, if found else, it returns an iterator pointing to unordered_set::end().

Is unordered_set fast?

Even though unordered set is faster in the average case, set is almost always guaranteed to have lower worst-case complexities (for example insert). If you want to access the elements in order, that set sorts them. Different sets can be lexicographically compared using <,<=, >, and >=.

What is the use of unordered_set in C++?

An unordered_set is an Associative container that contains an unordered set of data inserted randomly. Each element may occur only once, so duplicates are not allowed. A user can create an unordered set by inserting elements in any order and an unordered set will return data in any order i.e. unordered form.

What does unordered_set mean?

Unordered set is an associative container that contains a set of unique objects of type Key. Search, insertion, and removal have average constant-time complexity. Internally, the elements are not sorted in any particular order, but organized into buckets.


2 Answers

what is the time complexity of find method in unordered_set?

...it's right there in the page you linked:

Complexity:

Average case: constant.

Worst case: linear in container size.


and also it it possible to change the hash functions?

Yes. Again, look at the documentation!

std::unordered_map takes an Hash template parameter. It's a customization point where you can inject your own hashing logic. A custom Hash must satisfy the Hash concept.

like image 102
Vittorio Romeo Avatar answered Nov 02 '22 13:11

Vittorio Romeo


I guess you are getting confused by the default max_load_factor being 1. When you insert an int x in the unordered_set, it goes to the bucket i (i=x%number of buckets). So as you can imagine even if the hash function wont have collisions, as it maps each int with itself, the mod operation can have "collisions" in some cases. For example, if you insert 1, 4 and 6 in that order, both 1 and 6 will be in the same bucket, and the find function will need to go through the bucket to find them. The number of buckets is only increased when the load factor reaches the max load factor. The load factor is the arithmetic mean of the number of elements per bucket. So you can actually have more than one element in each bucket, and you can even have all elements in the same in the same bucket. In that case, finding an element that's inside the set would need a traditional sequential search (O(n)) inside the bucket. Here you have an example:

unordered_set<int> n;
n.insert(1);
n.insert(12);
n.insert(23);
n.insert(34);
n.insert(45);

In that case, every int is in the bucket 1, so when you look for 56 (56%11 = 1) you need to go through the whole bucket (size n, O(n)). The load factor is 0.4545 (5 elements / 11 buckets), so no buckets are added. You can reduce the max_load_factor (some languages use a load factor of 0.75), but that would increase the number of rehashes, as you would need to reserve buckets more frequently (the process of reserving is amortized constant, as it uses the same method std::vector uses, that's why in the example we have 11 buckets)

like image 21
ALEJANDRO PEREZ MORENO Avatar answered Nov 02 '22 13:11

ALEJANDRO PEREZ MORENO