Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

When do you use std::unordered_map::emplace_hint?

Tags:

I know how to use std::unordered_map::emplace, but how do I use emplace_hint? Neither cplusplus nor cppreference provide a set of examples that illustrate how we might know where to put the element.

Can anyone provide some information on this or give some examples / illustrations of when we might know where the emplaced element should go?

like image 292
quant Avatar asked Apr 02 '14 02:04

quant


1 Answers

What could an unordered_map potentially do with the hint? Well, if the iterator addresses an element with the same key as the element that emplace_hint has been asked to insert, then it can fail quickly - just a key comparison without any hashing or groping through any list of hash-colliding elements at that bucket. But if the key doesn't match, then the hint is otherwise useless because any other key - no matter how "close" in value - should (probabilistically) be at a completely unrelated bucket (given what's normally considered a "good" hash function), so time would have been wasted on a key comparison only to have to start over as if it were a normal emplace.

This might be useful when you're inserting pre-sorted-by-key elements, aiming to remove lots of duplicates in the process, but the key is so huge it's easier to keep an iterator to the just-inserted element than a copy of the key, or perhaps the hash function is particularly slow.

Another benefit of unordered_map::emplace_hint is better API compatibility with map::emplace_hint, so code can switch the container type and have the emplace_hints not break the compile, though they might end up slower than if the code were switched to emplace() as the close-but-different-key hints that help with a map may be useless with an unordered_map.


Just taking GCC 10.2 g++ -E output to see if it does what's described above. emplace_hint calls down into _M_insert_multi_node(...) wherein there's this line:

__node_base* __prev = __builtin_expect(__hint != nullptr, false)                       && this->_M_equals(__k, __code, __hint)                       ? __hint                       : _M_find_before_node(__bkt, __k, __code); 

Above, __k is the key that may be inserted, __code is the hash code, __hint is the hint iterator/pointer; _M_equals(...) returns:

return _Equal_hash_code<__node_type>::_S_equals(__c, *__n) &&        _M_eq()(__k, this->_M_extract()(__n->_M_v())); 

So, it's checking that the hash codes are equal (a quick and dirty check if you've already calculated the hashes) and the keys are equal (a potentially slower operation for e.g. a quality hash of long strings) before using the hint iterator. That's the only case in which it uses the hint. Imagine the bucket logically has some colliding elements chained off it with keys K1, K2, K3, K4 and your hint iterator is to K4 but you're trying to insert a duplicate with K2: as the iterators are forward only, you have to use _M_find_before_node(...) to reach colliding elements earlier in the chain than your hint points to. After the _M_find_before_node(...) you can scan from K1 forwards to see if the key to insert - K2 - is already present in the elements that collided at the bucket.

(The implementation could be improved by skipping the hash comparison when the key comparison is known to be cheap, but getting that condition right with type traits would be a bit of a pain - how do you know which key equality functions are cheap? Could assume so for small, standard layout, trivially copyable types or similar, at least when the unordered container is instantiated with the default std::equals<> comparison....).

like image 88
Tony Delroy Avatar answered Sep 23 '22 16:09

Tony Delroy