Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

hash function for a vector of pair<int, int>

Tags:

c++

hash

vector

I'm trying to implement an unordered_map for a vector< pair < int,int> >. Since there's no such default hash function, I tried to imagine a function of my own :

struct ObjectHasher
{
  std::size_t operator()(const Object& k) const
  {
    std::string h_string("");
    for (auto i = k.vec.begin(); i != k.vec.end(); ++i)
    {
      h_string.push_back(97+i->first);
      h_string.push_back(47); // '-'
      h_string.push_back(97+i->second);
      h_string.push_back(43); // '+'
    }
    return std::hash<std::string>()(h_string);
  }
};

The main idea is to change the list of integers, say ( (97, 98), (105, 107) ) into a formatted string like "a-b+i-k" and to compute its hash thanks to hash < string >(). I choosed the 97, 48 and 43 numbers only to allow the hash string to be easily displayed in a terminal during my tests.

I know this kind of function might be a very naive idea since a good hash function should be fast and strong against collisions. Well, if the integers given to push_back() are greater than 255 I don't know what might happen... So, what do you think of the following questions :

  • (1) is my function ok for big integers ?
  • (2) is my function ok for all environments/platforms ?
  • (3) is my function too slow to be a hash function ?
  • (4) ... do you have anything better ?
like image 847
suizokukan Avatar asked May 25 '14 20:05

suizokukan


3 Answers

All you need is a function to "hash in" an integer. You can steal such a function from boost:

template <class T>
inline void hash_combine(std::size_t& seed, const T& v)
{
    std::hash<T> hasher;
    seed ^= std::hash<T>(v) + 0x9e3779b9 + (seed<<6) + (seed>>2);
}

Now your function is trivial:

struct ObjectHasher
{
  std::size_t operator()(const Object& k) const
  {
    std::size_t hash = 0;
    for (auto i = k.vec.begin(); i != k.vec.end(); ++i)
    {
      hash_combine(hash, i->first);
      hash_combine(hash, i->second);
    }
    return hash;
  }
};
like image 97
David Schwartz Avatar answered Nov 09 '22 23:11

David Schwartz


This function is is probably very slow compared to other hash functions since it uses dynamic memory allocation. Also std::hash<std::string> Is not a very good hash function since it is very general. It's probably better to XOR all ints and use std::hash<int>.

like image 39
TNA Avatar answered Nov 09 '22 23:11

TNA


This is a perfectly valid solution. All a hash function needs is a sequence of bytes and by concatenating your elements together as a string you are providing a unique byte representation of the map.

Of course this could become unruly if your map contains a large number of items.

like image 28
mclaassen Avatar answered Nov 09 '22 21:11

mclaassen