The documentation for Hashtbl.hash
currently reads:
Hashtbl.hash x
associates a nonnegative integer to any value of any type. It is guaranteed that ifx = y
orStdlib.compare x y = 0
, thenhash x = hash y
. Moreover, hash always terminates, even on cyclic structures.
If I’m not mistaken, this is not documenting the limit on meaningful nodes traversed. Currently one has to look at books like Real World OCaml to find that the bound is set to 10:
OCaml’s polymorphic hash function works by walking over the data structure it’s given using a breadth-first traversal that is bounded in the number of nodes it’s willing to traverse. By default, that bound is set at 10 “meaningful” nodes.
i.e. hash
of a list will ignore everything after the first 10 elements.
Without this bound documented, one may be confused by why every list with the same 10-element prefix collides. Hiding this also makes performance of Hashtables a mystery. It’s unclear from reading the docs whether Hashtables require data to be optimised to be less than 10 nodes deep.
I originally posted this as an issue: Documentation for `Hashtbl.hash` does not mention the default meaningful nodes limit · Issue #2952 · ocaml/ocaml.org · GitHub
But I was told the documentation is maintained alongside the compiler. Unfortunately I couldn’t find where to submit issues about the compiler, so I’m creating this topic instead.