<code>std::unordered_map</code> guarantees O(1) time search, but how does it manage collisions? Cppreference claims <blockquote> Unordered map is an associative container that contains key-value pairs with unique keys. Search, insertion, and removal of elements have average constant-time complexity. </blockquote> Assuming a situation where all the hash codes are the same, how is the collision handled internally? My assumption would be totally wrong if the hash code were unique to every key. In that case, how is a unique hash code created so that there are no collisions at all? What approach does <code>std::unordered_map</code>'s hash function take to guarantee O(1) search?

How does std::unordered_map handle collisions? [duplicate]

3 Answers

It doesn't guarantee O(1); lookup is O(1) on average. In the worst case it can be O(n), when many keys collide in the same bucket. See the link below for more info:

https://stackoverflow.com/a/2771398/5874704

Update

Since the question has been edited and now asks specifically about how std::unordered_map handles collisions, please have a look at the following answer:

https://stackoverflow.com/a/21519560/5874704

<blockquote> I think we can conclude that all practical implementations of std::unordered_set (or unordered_map) almost certainly use collision chaining. While it might be (just barely) possible to meet the requirements using linear probing or double hashing, such an implementation seems to lose a great deal and gain nearly nothing in return. </blockquote>

answered Oct 15 '22 11:10

A.Fagrell

Your post omits a qualifier that is crucial to understand: std::unordered_map has average-case O(1) search. In the worst case, retrieving an element can take O(n) time, where n is the number of elements in the map.

As for which hash function it uses: that is up to the user, who can pass one as a template argument. By default it uses std::hash.

A requirement on the hash function that matters for collision handling is

<blockquote> Hash functions are only required to produce the same result for the same input within a single execution of a program; this allows salted hashes that prevent collision DoS attacks. (cppreference) </blockquote>

answered Oct 15 '22 11:10

erip

<blockquote> std::unordered_map guarantees O(1) time search, but how does it manage collisions? </blockquote>

In practice it uses separate chaining (also called open hashing), not open addressing; see here.

<blockquote> Cppreference claims

Unordered map is an associative container that contains key-value pairs with unique keys. Search, insertion, and removal of elements have average constant-time complexity.

Assuming a situation where all the hash codes are the same, how is the collision handled internally? </blockquote>

The colliding elements are added to a secondary container holding all the elements whose keys hash to that bucket. That container is usually a linked list, but there's nothing stopping an implementation from using e.g. a binary tree.

<blockquote> My assumption would be totally wrong if the hash code were unique to every key. In that case, how is a unique hash code created so that there are no collisions at all? </blockquote>

unordered_map isn't required or expected to do anything special to avoid collisions. (Hash codes being "unique to every key" doesn't suffice anyway, as collisions can still arise when hash codes are masked or reduced modulo the number of buckets.)

<blockquote> What approach does std::unordered_map's hash function take to guarantee O(1) search? </blockquote>

This is the crux of your misunderstanding. unordered_map has average O(1) performance when the hash function does an adequate job of spreading the keys across the buckets. It may degrade to O(n) if the hash function is poor, or has been deliberately targeted with malicious keys known to hash to the same bucket. The Standard does not require implementations to prevent that, but users can supply a cryptographic hash, pick a hash function from a family at runtime, or otherwise make it impractical for a malicious user (or for similar inputs generally) to create many collisions.

answered Oct 15 '22 10:10

Tony Delroy

Tags:

c++

hashmap

c++11

unordered-map

user2256825
