Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find bucket in unordered_map from hash without a key

Tags:

c++

c++11

stl

I am using a std::unordered_map. I have a hash value and a way to determine if a given candidate key is the key that I am looking for, but I do not have an actual key. I want to look up the bucket corresponding to the hash value and go through each element in that bucket to see if it is the element that I am looking for. Unfortunately, the function std::unordered_map::bucket(x) requires x to be a key. Is there really no way to get a bucket from a hash value without first constructing a key?

Details that you don't need to answer the question: I could construct the key but in the common case of no collisions this will take longer than only checking if the single candidate I have found in the bucket is the right one. I have a low load factor so there are few collisions and even for a collision the full hash value is unlikely to match, so non-matches are quickly determined not to match. I care about this because I have determined with a profiler that key construction is taking a significant amount of time - there are many lookups and each lookup requires the construction of a key.

Even more details that you really don't need to answer the question: The keys are vectors of integers and my query is for the sum of two vectors. It is faster to check if a given vector V is the sum of two vectors A and B than to sum the two vectors into a third vector C=A+B and then compare C to V. I am able to determine the hash value of A+B without calculating the actual vector A+B because I store the hash values of these vectors and my hash function f has the property that f(A+B)=f(A)+f(B). So I just add the two stored hash values to get the hash value of the sum. I have already made sure to keep a spare vector around so that constructing a key does not require memory allocation but the code for adding the vectors is still taking a significant amount of time on its own.

like image 668
Bjarke H. Roune Avatar asked Oct 15 '12 16:10

Bjarke H. Roune


People also ask

What does an unordered_map return if the key does not exist?

Return value the unordered map will try to default initialize mystruct with the key 'x' and assign to my struct. if u wanna avoid this, use . at(key) if it doesn't exist it will throw an out_of_range exception , which you can catch it and handle.

Which hash function is used in unordered_map?

The unordered_map::hash_function() is a built in function in C++ STL which is used to get the hash function. This hash function is a unary function which takes a single argument only and returns a unique value of type size_t based on it.

Does unordered_map store the key?

unordered_map is an associated container that stores elements formed by the combination of a key value and a mapped value. The key value is used to uniquely identify the element and the mapped value is the content associated with the key.

What is a bucket in unordered_map?

C++ Unordered_map Library - bucket() Function Bucket is a memory space in the container's hash table to which elements are assigned based on the hash value of their key. Valid range of buckets is from 0 to bucket_count - 1.


1 Answers

You cannot avoid constructing a key, but you can avoid constructing the entire key.

For example, let's say that you have a key class VectorKey that encapsulates an std::vector, and caches the computed hash code. Further suppose that you provide implementations of Hash and KeyEqual that access the cached hash code off your VectorKey, and compare encapsulated vectors for equality. You can define a constructor of VectorKey that always constructs an empty std::vector, and sets the cached hash code to a value passed to the constructor:

class VectorKey{
    int cached_hash;
    std::vector<int> key;
public:
    VectorKey(const std::vector<int>& _key)
    :    key(_key)
    ,    cached_hash(calc_hash(_key)) {
    }
    // *** This is the centerpiece of the solution: *** 
    // *** this constructor effectively lets you access *** 
    // *** a bucket with nothing more than a hash code. *** 
    VectorKey(int hash)
    :    cached_hash(hash) {
    }
    // More code goes here for getting cached_hash
    // and also for checking equality
private:
    int calc_hash(const std::vector<int>& _key) {
         // calculate the hash code based on the vector
    }
};

With a key class like that, you can quickly find buckets by constructing a fake key:

size_type bucketIndex = myHashMap.bucket(VectorKey(precalculated_hash));
like image 95
Sergey Kalinichenko Avatar answered Oct 18 '22 23:10

Sergey Kalinichenko