
How does unordered_set determine the inserting order in C++?

I know that people use unordered_set when they don't care about the order of the elements in the set. However, when I run this sample program on C++ Shell:

#include <iostream>
#include <string>
#include <unordered_set>

int main()
{
    std::unordered_set<std::string> inputSet;
    inputSet.insert("Hello world");
    inputSet.insert("Abcdef");
    inputSet.insert("This is the test string...");

    for (const auto &val : inputSet)
        std::cout << val.c_str() << std::endl;

    return 0;
}

it gives me

This is the test string...
Abcdef
Hello world

I ran it three or four times and it still gives me the same output, which suggests that unordered_set has some way of determining the order of its elements.

Can someone explain how unordered_set determines this order?

Sorry if it has been asked before, I've searched online for a while and I cannot find a specific answer to this question. Thanks in advance.

asked Dec 19 '22 10:12 by user2185071


2 Answers

There is no specified ordering. The container uses the default std::hash to hash each string, and whatever the hash value is, it is converted into a bucket index within the container.

The hash values in question can be obtained like this:

auto hello = std::hash<std::string>()("Hello world");
auto abcd = std::hash<std::string>()("Abcdef");
auto test = std::hash<std::string>()("This is the test string...");

For a particular STL implementation, this resolves to:

Hello maps to: 14420674105493498572
abcd maps to: 10830572898531769673
test maps to: 13068738153895491918


The hash value is usually converted to a bucket index by applying the % operator with the current bucket count. Moreover, std::unordered_set's iterator isn't required to walk the buckets sequentially, and elements whose hashes collide end up sharing a bucket. So you should not rely on any ordering you observe from the iterators between program runs.
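To see where each element actually lands, the container's standard bucket interface (bucket_count(), begin(n), end(n), hash_function()) can be queried directly. The following is only a sketch: the helper names bucketLayout and printBucketLayout are made up for illustration, and the layout it reports is specific to the standard library implementation and the current bucket count. It deliberately does not compute hash % bucket_count itself, since the standard does not require that exact mapping.

```cpp
#include <cstddef>
#include <iostream>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: returns the elements of `s` grouped by bucket index.
// Relies only on the standard bucket interface of std::unordered_set.
std::vector<std::vector<std::string>> bucketLayout(
    const std::unordered_set<std::string>& s)
{
    std::vector<std::vector<std::string>> layout(s.bucket_count());
    for (std::size_t b = 0; b < s.bucket_count(); ++b)
        for (auto it = s.begin(b); it != s.end(b); ++it)
            layout[b].push_back(*it);
    return layout;
}

// Hypothetical helper: prints every stored element with its bucket index
// and the hash value produced by the container's own hash function.
void printBucketLayout(const std::unordered_set<std::string>& s)
{
    const auto layout = bucketLayout(s);
    const auto hasher = s.hash_function();
    for (std::size_t b = 0; b < layout.size(); ++b)
        for (const auto& key : layout[b])
            std::cout << "bucket " << b << ": \"" << key
                      << "\" (hash " << hasher(key) << ")\n";
}
```

Calling printBucketLayout(inputSet) on the set from the question shows which bucket each string occupies. The numbers may differ between compilers and standard libraries, and they can change after a rehash, which is why the iteration order is not something to depend on.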


From C++14, std::hash<> is explicitly permitted to produce different results between different program runs. To quote:

Hash functions are only required to produce the same result for the same input within a single execution of a program; this allows salted hashes that prevent collision DoS attacks.

answered Dec 24 '22 01:12 by WhiZTiM


As stated at http://en.cppreference.com/w/cpp/container/unordered_set:

Internally, the elements are not sorted in any particular order, but organized into buckets. Which bucket an element is placed into depends entirely on the hash of its value. This allows fast access to individual elements, since once a hash is computed, it refers to the exact bucket the element is placed into.

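So the container uses either the default hash (std::hash) or a user-provided hash function to distribute the elements into buckets. The elements are not sorted; the order you see when iterating is a side effect of that bucket placement.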
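As a rough illustration of the "user-provided hash" case, here is a sketch that plugs a custom hasher into std::unordered_set through its second template parameter. LengthHash, LengthHashedSet and makeSampleSet are names invented for this example, and hashing by string length is a deliberately poor choice used only to make the effect easy to reason about.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical hasher for illustration only: hashes a string by its length.
// It collides often, but it is valid because equal strings yield equal hashes.
struct LengthHash {
    std::size_t operator()(const std::string& s) const noexcept
    {
        return s.size();
    }
};

// The hasher is supplied as the second template argument.
using LengthHashedSet = std::unordered_set<std::string, LengthHash>;

// Builds a set holding the same three strings as the question's example.
LengthHashedSet makeSampleSet()
{
    return LengthHashedSet{"Hello world", "Abcdef", "This is the test string..."};
}
```

Iterating over makeSampleSet() may print the strings in a different order than the default-hashed set does, because bucket placement now follows the string lengths rather than std::hash. The order is still unspecified either way.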

answered Dec 24 '22 01:12 by Mikel F