Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Good hash function for a 2d index

Tags:

c++

hash

I have a struct called Point. Point is pretty simple:

struct Point
{
    Row row;
    Column column;

    // some other code for addition and subtraction of points is there too
}

Row and Column are basically glorified ints, but I got sick of accidentally transposing the input arguments to functions and gave them each a wrapper class.

Right now I use a set of points, but repeated lookups are really slowing things down. I want to switch to an unordered_set.

So, I want to have an unordered_set of Points. Typically this set might contain, for example, every point on a 80x24 terminal = 1920 points. I need a good hash function. I just came up with the following:

struct PointHash : public std::unary_function<Point, std::size_t>
{
    result_type operator()(const argument_type& val) const
    {
        return val.row.value() * 1000 + val.col.value();
    }
};

However, I'm not sure that this is really a good hash function. I wanted something fast, since I need to do many lookups very quickly. Is there a better hash function I can use, or is this OK?

like image 773
rlbond Avatar asked Apr 14 '10 03:04

rlbond


People also ask

How do you find a good hash function?

There are four main characteristics of a good hash function: 1) The hash value is fully determined by the data being hashed. 2) The hash function uses all the input data. 3) The hash function "uniformly" distributes the data across the entire set of possible hash values.

What function can serve as a good hash function?

If you just want to have a good hash function, and cannot wait, djb2 is one of the best string hash functions i know. it has excellent distribution and speed on many different sets of keys and table sizes. you are not likely to do better with one of the "well known" functions such as PJW, K&R[1], etc. Also see tpop pp.

What is an index in the hash function?

What is a hash index? Basically, a hash index is an array of N buckets or slots, each one containing a pointer to a row. Hash indexes use a hash function F(K, N) in which given a key K and the number of buckets N , the function maps the key to the corresponding bucket of the hash index.

How can hashing be used to construct an index?

The most popular use for hashing is the implementation of hash tables. A hash table stores key and value pairs in a list that is accessible through its index. Because key and value pairs are unlimited, the hash function will map the keys to the table size. A hash value then becomes the index for a specific element.

What is a hash index?

Hash indexes are one of many ways to organize data in a database. They work by taking input and using it as a key for storage on a disk. These keys, or hash values, can be anything from string lengths to characters in the input. Hash indexes are most commonly used when querying specific inputs with specific attributes.

What is a hash function?

In simple terms, a hash function maps a big number or string to a small integer that can be used as the index in the hash table. What is meant by Good Hash Function? A good hash function should have the following properties: Efficiently computable. Should uniformly distribute the keys (Each table position equally likely for each key)

What are the two heuristic methods for creating hash functions?

The two heuristic methods are hashing by division and hashing by multiplication which are as follows: In this method for creating hash functions, we map a key into one of the slots of table by taking the remainder of key divided by table_size. That is, the hash function is

Which hash function is not suitable for hashing a key?

Thus, a hash function that simply extracts a portion of a key is not suitable. Similarly, if two keys are simply digited or character permutations of each other (such as 139 and 319), they should also hash into different values. The two heuristic methods are hashing by division and hashing by multiplication which are as follows:


1 Answers

Following the technique is given in Effective Java (2nd edition), and quoted from there in Programming in Scala. Have a prime constant (we'll say 53 but you may find something larger will give more even distribution here), and perform multiplication and addition as follows:

(53 + int_hash(row)) * 53 + int_hash(col)

For more values (say you add a z coordinate), just keep nesting, like

((53 + int_hash(row)) * 53 + int_hash(col)) * 53 + int_hash(z)

Where int_hash is a function for hashing a single integer. You can visit this page to find a bunch of good hash functions for single integers.

like image 144
Ken Bloom Avatar answered Sep 20 '22 08:09

Ken Bloom