Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the use for buckets interface in std::unordered_map?

Tags:

c++

c++11

stl

I've been watching this video from CppCon 2014 and discovered that there is an interface to access buckets underneath std::unordered_map. Now I have a couple of questions:

  • Are there any reasonable examples of the usage of this interface?
  • Why did the committee decide to define this interface, why typical STL container interface wasn't enough?
like image 655
Kostya Avatar asked Jun 29 '15 20:06

Kostya


3 Answers

It is often enlightening to search for the proposal that introduced an item, as there is often an accompanying rationale. In this case N1443 says this:

G. Bucket Interface

Like all standard containers, each of the hashed containers has member function begin() and end(). The range [c.begin(), c.end()) contains all of the elements in the container, presented as a flat range. Elements within a bucket are adjacent, but the iterator interface presents no information about where one bucket ends and the next begins.

It's also useful to expose the bucket structure, for two reasons. First, it lets users investigate how well their hash function performs: it lets them test how evenly elements are distributed within buckets, and to look at the elements within a bucket to see if they have any common properties. Second, if the iterators have an underlying segmented structure (as they do in existing singly linked list implementations), algorithms that exploit that structure, with an explicit nested loop, can be more efficient than algorithms that view the elements as a flat range.

The most important part of the bucket interface is an overloading of begin() and end(). If n is an integer, [begin(n), end(n)) is a range of iterators pointing to the elements in the nth bucket. These member functions return iterators, of course, but not of type X::iterator or X::const_iterator. Instead they return iterators of type X::local_iterator or X::const_local_iterator. A local iterator is able to iterate within a bucket, but not necessarily between buckets; in some implementations it's possible for X::local_iterator to be a simpler data structure than X::iterator. X::iterator and X::local_iterator are permitted to be the same type; implementations that use doubly linked lists will probably take advantage of that freedom.

This bucket interface is not provided by the SGI, Dinkumware, or Metrowerks implementations. It is inspired partly by the Metrowerks collision-detection interface, and partly by earlier work (see [Austern 1998]) on algorithms for segmented containers.

like image 147
Howard Hinnant Avatar answered Sep 28 '22 07:09

Howard Hinnant


I imagine you can benefit greatly from this if you're in a high performance situation and collisions end up killing you. Iterating the buckets and looking @ the bucket size periodically could tell you if your hashing policy is good enough.

Unordered maps are greatly dependent on their hashing policy when it comes to performance.

like image 26
NG. Avatar answered Sep 28 '22 07:09

NG.


The only reason I have ever needed the interface is to traverse all the objects in a map without having to hold a lock on the map or copy the map. This can be used for imprecise expiration or other types of periodic checks on objects in the map.

The traverse works as follows:

  1. Lock the map.

  2. Begin traversing the map in bucket order, operating on each object you encounter.

  3. When you decide you've held the lock for too long, stash the key of the object you last operated on.

  4. Wait until you wish to resume operating.

  5. Lock the map, and go to step 2, starting at or near (in bucket order) the key you stopped on. If you reach the end, start back at the beginning.

like image 42
David Schwartz Avatar answered Sep 28 '22 07:09

David Schwartz