Why aren't polymorphic type variables inferred?

I’ve been implementing a finger tree the past few days (yes, I know there is one in batteries). It has a type like this:

type 'a node =
  | Node2 of 'a * 'a
  | Node3 of 'a * 'a * 'a

type 'a finger_tree =
  | Empty
  | Single of 'a
  | Deep of 'a list * 'a node finger_tree * 'a list

As you can see, the Deep member of finger_tree contains a reference to a finger tree of a different type, which in turn means that functions operating on these objects recursively use polymorfic recursion; i.e. the recursive call will be to a type that is different than the input type. For example:

  let rec fold_left : 'a. ('b -> 'a -> 'b) -> 'b -> 'a finger_tree -> 'b =
    fun f acc t -> match t with
      | Empty -> acc
      | Single x -> f acc x
      | Deep (pr,mid,sf) ->
        let f' = List.fold_left f
        and f'' = fold_left (Node.fold_left f) in
        f' (f'' (f' acc pr) mid) sf

I understand why the polymorphic annotation is needed here ('a. in the type signature—which is not well documented, by the way) in terms of OCaml’s normal behavior. What I don’t understand is why the types necessary for polymorphic recursion cannot be inferred. Why is that?

2 Likes

Type inference of polymorphic recursion is undecidable. (Source: Wikipedia, IIRC also the Okasaki book.) You could probably add some heuristics to the type inference engine for some common cases, but that would be complicated, I’d bet it would cause much worse type errors, and people don’t use polymorphic recursion all that often (even though it is cool and useful!). Haskell, and then OCaml, choose to just not and require type annotations when programmers actually mean polymorphic recursion.

1 Like

Seems to be documented here: OCaml - Polymorphism and its limitations

Is anything missing?

1 Like

I guess I was looking in the wrong spot: OCaml - Types. :expressionless:

That’s a good answer. Why don’t we do it like Haskell, where simply adding the type annotation allows for polymorphic type variables? Why do we need a.?

Maybe this is a stupid question, since OCaml does many things differently than Haskell and tends to favor explicitness and verbosity (relatively speaking).

The direct reason is right above the section @yawaramin linked: normal type variable instantiation is shared with the recursive calls (so it’s trying to unify 'a with 'a node), while explicitly polymorphic type variables are instantiated per call.

I’m not sure why it does it this way, though; I’d guess it makes type checking significantly faster or simpler, or is connected with some type inference improvements? Maybe it makes some features OCaml has that Haskell doesn’t work better?

3 Likes

This is exactly what the annotation

let rec fold_left : 'a. ('b -> 'a -> 'b) -> 'b -> 'a finger_tree -> 'b

is doing with an explicit universal quantification of the type variable 'a.

Without such quantification, OCaml type variables are variables that scope other the whole definition (aka unification type variables) which allow to write

val f: (Format.formatter -> int -> unit as 'printer) -> 'printer

or

module M = struct type t = A | B end
let z =
  let y : 'a = M.A in
  (y, (B:'a))

It is debatable that this is the optimal default behavior, but it is at least an explicit one.

2 Likes