Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Inserting to std::unordered_map calls hash function twice in MSVC++'s STL, bad design or special reason?

For this code:

#include<unordered_map>
#include<iostream>
using namespace std;
struct myhash {
    unsigned operator()(const unsigned&v)const {
        cout<<"Hash function is called:"<<v<<endl;
        return v;
    }
};
unordered_map<unsigned,unsigned,myhash>mp;
int main() {
    for (unsigned i=0;i<3;++i) {
        cout<<"Visiting hash table:"<<i<<endl;
        ++mp[i];
    }
}

When using g++, there's no surprise with the output:

Visiting hash table:0
Hash function is called:0
Visiting hash table:1
Hash function is called:1
Visiting hash table:2
Hash function is called:2

But the output of MSVC++(2015) shocked me:

Visiting hash table:0
Hash function is called:0
Hash function is called:0
Visiting hash table:1
Hash function is called:1
Hash function is called:1
Visiting hash table:2
Hash function is called:2
Hash function is called:2

Further tests showed that MSVC++'s STL calls the hash function twice when inserting a new element in the unordered_map and once if the element is already in the map.

I feel pretty strange for this behavior because I think caching the result is much more faster than calling the hash function again under most circumstances( if the implementation really need to use the hash value twice). Or better, I think is just use the hash function once to find the bucket for the element and the following routine is not relevant to the hash value.

However, since the authors of STL are much more experienced than me in C++, I don't know if they have some special reasons for this implementation? Is this just a bad design or caused by some reasons unknown to me?

like image 737
James Avatar asked Nov 21 '15 02:11

James


2 Answers

observed the same behavior with VS2012 too, in Release mode.

the difference is in implementation of un_orderedmap. if you debug and step into: Microsoft Visual Studio 11.0\VC\include\unordered_map, the first call to function call operator is to check if key is already present in map. The second call is made during insert into map (provided the key is not found) - after 3 iterations map has ((0,1), (1,1), (2,1))

like image 159
Nandu Avatar answered Oct 16 '22 10:10

Nandu


Well if we look at 23.5.4.3 neither the effects nor complexity clauses seem to require the hash function to be called only once. It appears that while intuitively strange, both implementations are legal.

like image 34
Mark B Avatar answered Oct 16 '22 08:10

Mark B