this is my first question on these forums : )
I'm writing a coordinate class in Java for a spatial octree voxel system. These coordinates are not floating point coordinates; they are 4D integer indexes into the octree (3 normal dimensions X, Y, Z, and a fourth for depth into the tree). The first 3 values are all shorts, and the last dimension is a byte. In actual use right now only the first 11 bits of the shorts are used and only 3 bits of the byte, but this could be subject to change.
Now I'm trying to write a 'good' hash function for this class. The problem I'm wrestling with is that the coordinates are often going to be used in highly spatially coherent situations (hope I'm using the right terminology there). What I mean is that a coordinate will often be hashed along with its immediately adjacent neighbors and other nearby coordinates.
Is there an effective practice to cause these 'near to each other' coordinates to produce significantly different hashcodes?
You are in luck: there is a way to get decent co-ordinate encodings with high spatial coherence using something called a Z-order curve.
The trick is to interleave the bits of the different co-ordinate components. So if you have 3 8-bit co-ordinates like:
[XXXXXXXX, YYYYYYYY, ZZZZZZZZ]
Then the z-curve encoded value would be a single 24-bit value:
XYZXYZXYZXYZXYZXYZXYZXYZ
You can extend to larger numbers of bits or co-ordinates as required.
This encoding works because co-ordinates which are close in space will have differences mainly in the lower order bits. So by interleaving the co-ordinates, you get the differences focused in the lower-order bits of the encoded value.
An extra interesting property is that the lower bits describe co-ordinates within cubes of space. The lowest 3 bits address a position within a 2x2x2 cube, the lowest 6 bits a position within a 4x4x4 cube, the lowest 9 bits a position within an 8x8x8 cube, and so on. So this is actually a pretty ideal system for addressing co-ordinates within an octree.
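A minimal sketch of the bit interleaving described above, in Java. The class name ZOrder, the method name interleave and the bits parameter are made up for illustration and are not from any library. It returns a long because three 11-bit components (as in the question) need 33 bits, which does not fit in an int.

```java
final class ZOrder {
    // Interleaves the low `bits` bits of x, y and z into one value laid out
    // as ...XYZXYZXYZ, with bit 0 of z ending up in bit 0 of the result.
    // Assumes bits <= 21 so that 3 * bits fits in a long.
    static long interleave(int x, int y, int z, int bits) {
        long code = 0L;
        for (int i = 0; i < bits; i++) {
            code |= (long) ((x >>> i) & 1) << (3 * i + 2); // X bit of triple i
            code |= (long) ((y >>> i) & 1) << (3 * i + 1); // Y bit of triple i
            code |= (long) ((z >>> i) & 1) << (3 * i);     // Z bit of triple i
        }
        return code;
    }
}
```

With this layout, ZOrder.interleave(1, 0, 0, 8) gives 4, ZOrder.interleave(0, 1, 0, 8) gives 2 and ZOrder.interleave(0, 0, 1, 8) gives 1, so the lowest 3 bits pick a cell inside a 2x2x2 cube. If an int is needed for hashCode(), one option is to fold the result with Long.hashCode(code); the depth byte would have to be mixed in separately.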
"Significantly different" really depends on what you're doing with the hash code afterwards. In some cases it will then be subject to a round-robin bucket pick by taking the hash % size
where size is the size of the hash map you're using, for example. Obviously that will change over time. I'd usually use something like:
int hash = 23;
hash = hash * 31 + x;
hash = hash * 31 + y;
hash = hash * 31 + z;
hash = hash * 31 + depth;
return hash;
(This is cribbed from Effective Java, basically.) Obviously it means that (x1, y1, z1) and (x1 + 1, y1 - 31, z1) would have the same hash code, but if you're mostly worried about very near neighbours it shouldn't be a problem.
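To see that collision concretely, here is a small sketch that wraps the snippet above in a static method. The class name SimpleCoordHash and the method name hash are just for illustration. Adding 1 to x adds 31^3 to the result, while subtracting 31 from y removes 31 * 31^2, so the two changes cancel.

```java
final class SimpleCoordHash {
    // Same multiply-by-31 scheme as the snippet above, taking the four
    // components as ints (shorts and bytes widen to int automatically).
    static int hash(int x, int y, int z, int depth) {
        int hash = 23;
        hash = hash * 31 + x;
        hash = hash * 31 + y;
        hash = hash * 31 + z;
        hash = hash * 31 + depth;
        return hash;
    }
}
```

For example, SimpleCoordHash.hash(5, 100, 7, 2) and SimpleCoordHash.hash(6, 69, 7, 2) return the same value, whereas the immediate neighbour SimpleCoordHash.hash(6, 100, 7, 2) does not collide with the first.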
EDIT: mikera's answer is likely to work better but be more complicated to code. I would personally try this very simple approach first, and see whether it's good enough for your actual use cases. Use progressively more effective but complicated approaches until you find one which is good enough.