Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How was `val hash : 'a -> int` was implemented in OCaml?

In OCaml, Hashtbl can hash any thing to an int

Hashtbl.hash x associates a nonnegative integer to any value of any type. It is guaranteed that if x = y or Pervasives.compare x y = 0, then hash x = hash y. Moreover, hash always terminates, even on cyclic structures.

I mean, in Java, we have hashCode() for every object which returns an integer and Java's Hashtable can hash that integer.

But how did OCaml achieve that to hash anything?

like image 478
Jackson Tale Avatar asked Mar 22 '13 17:03

Jackson Tale


1 Answers

It's not tricky. Hashtbl.hash just traverses data in a manner similar to the way the garbage collector does it. It travels a fixed distance into the linked structure, which avoids failing when there are cycles. It doesn't know anything about the high-level types of what it encounters, it just hashes the primitive values it reaches.

You can see the code in byterun/hash.c in the OCaml source distribution.

Update

It occurs to me you might have been asking how Hashtbl.hash was implemented in OCaml. The answer is that it can't be implemented in OCaml (without cheating) because it violates parametricity. The only possible pure functions of type 'a -> int are ones that return a constant value. The intuition is that such a function can't use any information about its argument because it is defined for all possible types.

Hashtbl.hash is one of a few OCaml functions that violate parametricity. They exist in OCaml because they're extremely handy. Another notorious one is the polymorphic comparison compare (and the associated = operator).

like image 151
Jeffrey Scofield Avatar answered Nov 05 '22 12:11

Jeffrey Scofield