
hash function providing unique uint from an integer coordinate pair

Tags: hashtable, hash

The problem in general: I have a big 2D point space, sparsely populated with dots. Think of it as a big white canvas sprinkled with black dots. I have to iterate over and search through these dots a lot. The canvas (point space) can be huge, bordering on the limits of int, and its size is unknown before points are set in it.

That brought me to the idea of hashing:

Ideal: I need a hash function that takes a 2D point and returns a unique uint32, so that no collisions can occur. You can assume that the number of dots on the canvas is easily countable by uint32.

IMPORTANT: It is impossible to know the size of the canvas beforehand (it may even change), so things like

canvaswidth * y + x

are sadly out of the question.

I also tried a very naive

abs(x) + abs(y)

but that produces too many collisions.
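To illustrate the collisions, here is a minimal Python sketch. The name `naive_hash` and the 7x7 test grid are only assumptions for the demonstration; every point on the same "diamond" around the origin ends up with the same key.

```python
from collections import Counter


def naive_hash(x: int, y: int) -> int:
    """The naive attempt from above: sum of absolute coordinates."""
    return abs(x) + abs(y)


def key_histogram(radius: int) -> Counter:
    """Count how many points of a (2*radius+1)^2 grid share each key."""
    points = [(x, y)
              for x in range(-radius, radius + 1)
              for y in range(-radius, radius + 1)]
    return Counter(naive_hash(x, y) for x, y in points)


if __name__ == "__main__":
    hist = key_histogram(3)
    # 49 points collapse onto only 7 distinct keys (0 through 6).
    print(len(hist), "distinct keys for", sum(hist.values()), "points")
    # (1, 2), (-2, 1) and (0, -3) all hash to 3.
    print(naive_hash(1, 2), naive_hash(-2, 1), naive_hash(0, -3))
```

The sign of each coordinate is thrown away and the two coordinates are merged by addition, so the key only encodes the Manhattan distance from the origin.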

Compromise: A hash function that provides keys with a very low probability of collision.

Any ideas anybody? Thanks for any help.

Best regards, Andreas T.

Edit: I had to change something in the question text: I changed the assumption "able to count the number of points of the canvas with uint32" into "able to count the dots on the canvas (or the number of coordinate pairs to store) by uint32". My original question didn't make much sense, because I would have had a sqrt(max(uint32)) x sqrt(max(uint32)) sized canvas, which is uniquely representable by a 16-bit shift and OR.

I hope this is OK, since all answers still make sense with the updated assumptions.

Sorry for that.
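For reference, the 16-bit shift and OR mentioned in the edit could look roughly like this in Python. The names `pack16` and `unpack16` are made up for this sketch, and it assumes both coordinates fit in a signed 16-bit range (-32768 to 32767), which is exactly the restriction that makes it unusable for the huge canvas.

```python
def pack16(x: int, y: int) -> int:
    """Pack two signed 16-bit coordinates into one collision-free 32-bit key."""
    if not (-32768 <= x <= 32767 and -32768 <= y <= 32767):
        raise ValueError("coordinates must fit in signed 16 bits")
    # Masking keeps the low 16 bits (two's complement for negatives).
    return ((y & 0xFFFF) << 16) | (x & 0xFFFF)


def unpack16(key: int) -> tuple[int, int]:
    """Inverse of pack16: recover (x, y), sign-extending each half."""
    x = key & 0xFFFF
    y = (key >> 16) & 0xFFFF
    if x >= 0x8000:
        x -= 0x10000
    if y >= 0x8000:
        y -= 0x10000
    return x, y


if __name__ == "__main__":
    key = pack16(-5, 1200)
    print(hex(key), unpack16(key))
```

Because the mapping is invertible, no two in-range points can share a key.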

AndreasT asked Mar 25 '09 16:03



1 Answer

Cantor's enumeration of pairs

   n = ((x + y)*(x + y + 1)/2) + y 

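might be interesting, as it's closest to your original canvaswidth * y + x but works without knowing the canvas size. It is only defined for non-negative x and y, so negative coordinates have to be mapped to non-negative integers first, and for large coordinates n no longer fits in 32 bits. For a real-world int32 hash, rather than a mapping of pairs of integers to integers, you're probably better off with a bit manipulation such as Bob Jenkins' mix, calling it with x, y and a salt.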
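Both options can be sketched in Python as follows. Several things here are assumptions of the sketch rather than part of the answer: the zigzag step that folds negative coordinates onto non-negative ones before pairing, all function names, and the default salt value. The mixing rounds are modeled on the 96-bit mix from Bob Jenkins' lookup2 as I remember it, so check the shift constants against his original before relying on them.

```python
MASK32 = 0xFFFFFFFF


def zigzag(n: int) -> int:
    """Map signed to non-negative: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4."""
    return 2 * n if n >= 0 else -2 * n - 1


def cantor_pair(x: int, y: int) -> int:
    """Cantor's enumeration of pairs, valid for non-negative x and y."""
    s = x + y
    return s * (s + 1) // 2 + y


def cantor_signed(x: int, y: int) -> int:
    """Collision-free key for signed coordinates; may exceed 32 bits."""
    return cantor_pair(zigzag(x), zigzag(y))


def jenkins_mix(a: int, b: int, c: int) -> int:
    """Three rounds of subtract/xor/shift on 32-bit words (lookup2-style)."""
    for s1, s2, s3 in ((13, 8, 13), (12, 16, 5), (3, 10, 15)):
        a = ((a - b - c) & MASK32) ^ (c >> s1)
        b = ((b - c - a) & MASK32) ^ ((a << s2) & MASK32)
        c = ((c - a - b) & MASK32) ^ (b >> s3)
    return c


def hash_point(x: int, y: int, salt: int = 0x9E3779B9) -> int:
    """32-bit hash of a point; collisions are possible but should be rare."""
    return jenkins_mix(x & MASK32, y & MASK32, salt & MASK32)


if __name__ == "__main__":
    print(cantor_signed(0, 0), cantor_signed(-1, 0), cantor_signed(0, -1))
    print(hex(hash_point(12, -7)))
```

The trade-off is the one the answer describes: `cantor_signed` never collides but its result grows roughly with the square of the coordinates, so it cannot stay inside a uint32 for a canvas near the limits of int, while `hash_point` always fits in 32 bits and therefore has to tolerate occasional collisions.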

Pete Kirkham answered Sep 28 '22 02:09