Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is the unordered_map really unordered?

I am very confused by the name 'unordered_map'. The name suggests that the keys are not ordered at all. But I always thought they are ordered by their hash value. Or is that wrong (because the name implies that they are not ordered)?

Or to put it different: Is this

typedef map<K, V, HashComp<K> > HashMap;

with

template<typename T>
struct HashComp {
    bool operator<(const T& v1, const T& v2) const {
        return hash<T>()(v1) < hash<T>()(v2);
    }
};

the same as

typedef unordered_map<K, V> HashMap;

? (OK, not exactly, STL will complain here because there may be keys k1,k2 and neither k1 < k2 nor k2 < k1. You would need to use multimap and overwrite the equal-check.)

Or again differently: When I iterate through them, can I assume that the key-list is ordered by their hash value?

like image 874
Albert Avatar asked Jul 04 '10 23:07

Albert


2 Answers

In answer to your edited question, no those two snippets are not equivalent at all. std::map stores nodes in a tree structure, unordered_map stores them in a hashtable*.

Keys are not stored in order of their "hash value" because they're not stored in any order at all. They are instead stored in "buckets" where each bucket corresponds to a range of hash values. Basically, the implementation goes like this:

function add_value(object key, object value) {
   int hash = key.getHash();

   int bucket_index = hash % NUM_BUCKETS;
   if (buckets[bucket_index] == null) {
       buckets[bucket_index] = new linked_list();
   }
   buckets[bucket_index].add(new key_value(key, value));
}

function get_value(object key) {
   int hash = key.getHash();

   int bucket_index = hash % NUM_BUCKETS;
   if (buckets[bucket_index] == null) {
       return null;
   }

   foreach(key_value kv in buckets[bucket_index]) {
       if (kv.key == key) {
           return kv.value;
       }
   }
}

Obviously that's a serious simplification and real implementation would be much more advanced (for example, supporting resizing the buckets array, maybe using a tree structure instead of linked list for the buckets, and so on), but that should give an idea of how you can't get back the values in any particular order. See wikipedia for more information.


* Technically, the internal implementation of std::map and unordered_map are implementation-defined, but the standard requires certain Big-O complexity for operations that implies those internal implementations

like image 118
Dean Harding Avatar answered Sep 25 '22 20:09

Dean Harding


"Unordered" doesn't mean that there isn't a linear sequence somewhere in the implementation. It means "you can't assume anything about the order of these elements".

For example, people often assume that entries will come out of a hash map in the same order they were put in. But they don't, because the entries are unordered.

As for "ordered by their hash value": hash values are generally taken from the full range of integers, but hash maps don't have 2**32 slots in them. The hash value's range will be reduced to the number of slots by taking it modulo the number of slots. Further, as you add entries to a hash map, it might change size to accommodate the new values. This can cause all the previous entries to be re-placed, changing their order.

In an unordered data structure, you can't assume anything about the order of the entries.

like image 28
Ned Batchelder Avatar answered Sep 22 '22 20:09

Ned Batchelder