Why is std::hash not guaranteed to be deterministic?

Tags:

Hereafter, we use N4140 (C++14 Standard).

According to § 17.6.3.4 Hash requirements,

The value returned shall depend only on the argument k for the duration of the program.

[ Note: Thus all evaluations of the expression h(k) with the same value for k yield the same result for a given execution of the program. — end note ]

and § 20.9.12 Class template hash says

...

the instantiation hash<Key> shall:

(1.1) — satisfy the Hash requirements (17.6.3.4) ...

(1.2) — ...

This means a hash value of value (i.e. hash<decltype(value)>(value)) may take a different value if you restart the program.

But why? This limitation was not in the Standard of C++11, but in the Standard of C++14, C++17 and C++20. As a user (not a STL developer), it would be quite useful if std::hash were deterministic. Are there any mathematical difficulties in implementing a deterministic hash function? But hash functions we daily use (e.g. deprecated md5sum or safer sha256) are all deterministic. Is there a problem of efficiency?

579

asked Mar 06 '20 16:03

ynn

2 Answers

There is no need for the hash function to be deterministic between runs, but you can still provide your own hash, e.g. for unordered containers if it's a behavior you rely on.

As for why, cppreference says:

Hash functions are only required to produce the same result for the same input within a single execution of a program; this allows salted hashes that prevent collision denial-of-service attacks.

If the Hash requirements tells it to be deterministic, then you wouldn't be able to provide a salted hash without breaking the requirement.

Here is the actual explanation why

181

answered Sep 20 '22 22:09

Geoffroy

This answer (and links in it) suggested by @NathanOliver is ultimately helpful. Let me cite important parts.

For a non-cryptographic hash function, it's possible to pre-calculate massive inputs with the same hashed value to algorithmically slow down the unordered containers, and results in a denial-of-service attack.

(from Issue 2291. std::hash is vulnerable to collision DoS attack)

For this reason, language designers are migrating to random hashing. In random hashing, the hash value of the string “a” can change every time you run your program. Random hashing is now the default in Python (as of version 3.3), Ruby (as of version 1.9) and Perl (as of version 5.18).

(from Do you realize that you are using random hashing?)

Move to Ready, rather than Immediate, as even the permission has been contentious in reflector discussion

(from Issue 2291. std::hash is vulnerable to collision DoS attack)

In practice, as far as I understand, no implementation of std::hash implements random hashing but you can write your own my::secure_hash.

(from this answer)

P.S.

I just googled "hash table dos" and found an informative page: The moment when you realize every server in the world is vulnerable.

answered Sep 20 '22 22:09

ynn

Related questions
                            
                                How can a conforming C++ implementation indicate that it doesn't know the current date and time?
                            
                                Are variadic constructors supposed to hide the implicitly generated ones?
                            
                                enable_shared_from_this and inheritance
                            
                                Is void{} legal or not?
                            
                                Is it undefined behavior to iterate over an std::array initialized in the for-loop?
                            
                                Displaying SVG in OpenGL without intermediate raster
                            
                                Format C/C++ code on save
                            
                                Move value from local stack to heap? (C++)
                            
                                Is there a Python equivalent for C++ "multiset<int>"?
                            
                                Purpose of reference template arguments
                            
                                Python-style integer division & modulus in C
                            
                                How do you get the icon, MIME type, and application associated with a file in the Linux Desktop?
                            
                                How do take a screenshot correctly with xlib?
                            
                                Class can't access its own private static constexpr method - Clang bug?
                            
                                Debugging embedded Lua
                            
                                When virtual inheritance IS a good design?
                            
                                Build server / continuous integration recommendation for C++ / Qt-based projects
                            
                                How to move from std::optional<T>
                            
                                Doxygen @param direction arguments [in],[out],[in,out] example output
                            
                                Are there any realistic use cases for `decltype(auto)` variables?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Why is std::hash not guaranteed to be deterministic?

Tags:

c++

std

language-lawyer

hash

ynn

People also ask

2 Answers

Geoffroy

ynn

Recent Activity

Donate For Us