Weird behaviors with first order polymorphism

Hello, I’m not sure if this is the right place to post this. StackOverflow might be better. Let me know if I should move it there.

I’ve been working with higher-order polymorphism lately and found some strange typechecker behaviors. They stem from using “phantom” types parameter (e.g. type 'a t = <thing> where 'a doesn’t appear in <thing>). The typechecker seems unable to unify function types like int -> string with 'a. int -> string, without annotations (it succeeds with some), but flat out fails if there is a second type parameter involved (e.g. 'b -> string and 'a. 'b -> string), no matter what annotation I add. In addition, I’ve found some really weird behaviors. Namely, code that typechecks with opaque types, but not transparent ones.

(On an unrelated note, I also stumbled across a syntax error: let _ : 'a. <thing> = ... and let _ : type a. <thing> are invalid syntax, but valid if you replace _ with _unused)

I’ll explain the “minimal” example I came up with to showcase this in the rest of this post, and provide a full .ml file if you want to replicate this. Unfortunately it is still quite long. I’ve tested this both on OCaml 4.14.1 and 5.1.1 and found no differences.

I’m working on generic sets (sets whose elements are of type 'a elt). This is useful for GADTs, for instance, since these already constraint 'a. Here is a simple signature of such a set with only an iter function.

module GenericSet (GenericElt : sig type 'a t end) : sig
  type 'a elt = 'a GenericElt.t
  type t

  type polyiter = { f : 'a. 'a elt -> unit }
  val iter : polyiter -> t -> unit
end

Problem 1: implementing iter for non-generic sets [solved]

Now the first problems cropped up when trying to use GenericSet to define a non-generic Set module. I wanted to simplify the signature since types like polyiter shouldn’t be needed anymore, but reimplementing iter proved to be very challenging:

module Set (Elt : sig type t end) : sig
  type t
  val iter : (Elt.t -> unit) -> t -> unit
end = struct
  module GElt = struct type 'a t = Elt.t end
  module GSet = GenericSet (GElt)
  type t = GSet.t

  let iter f set = GSet.iter ... set
end

Here is a short list of valid ways to use annotations to define Set.iter:

(* 1 *) let iter f set = GSet.iter { f = fun (elt : Elt.t) -> f elt } set

(* 2 *) let iter (f : Elt.t -> unit) set = GSet.iter { f } set

(* 3 *) let iter f set =
            let f : Elt.t -> unit = f in
            GSet.iter { f } set

(* 4 *) let iter f set =
            let f : type a. Elt.t -> unit = f in
            GSet.iter { f } set

And here is a list of ways that are rejected with the typechecker:

(* 5 *) let iter f set = GSet.iter { f } set

(* 6 *) let iter f set =
            let f : 'a. Elt.t -> unit = f in
            GSet.iter { f } set

(* 7 *) let iter f set = GSet.iter { f = fun (type _a) elt -> f elt } set

They all fail with the same message:

File "test.ml", line 39, characters 29-30:
39 |   let iter f x = GSet.iter { f } x
                                  ^
Error: This field value has type Elt.t -> unit which is less general than
       'a. Elt.t -> unit

The fact that 5 fails is expected, as there are no annotations. But 6 is confusing, especially considering both 3 and 4 are accepted. I found an explanation of stack overflow of the differences between type a. and 'a but can’t quite understand why they crop up here (polymorphism - What is the difference between `'a.` and `type a.` and when to use each? - Stack Overflow). Furthermore, the error message here doesn’t help, or rather, I’m not sure why we can’t unify Elt.t -> unit and 'a. Elt.t -> unit.

Using opaque types

Another solution I found is to use opaque types instead of annotations. This is a bit heavier (even without the annotations), and it requires an extra function call. Overall, I’d rather avoid it if possible.

module GEltOpaque : sig
  type 'a t
  val convert : 'a t -> Elt.t
end = struct
  type 'a t = Elt.t
  let convert x = x
end

module GSetOpaque = GenericSet (GEltOpaque)

(* 8 *) let iter f set = GSetOpaque.iter { f = fun x -> f (GEltOpaque.convert x) } set

At that point I made the minimal example, found the annotation strategies that worked, and though I’d solved my problem. Unfortunately, I quickly stumbled into a version I couldn’t solve with annotations…

Problem 2: Instanciating polyiter for generic maps [solved but unsatisfying]

Generic maps are similar to generic sets, but they have an extra type parameter for the values. So a 'b map maps keys of type 'a key to values of type ('a, 'b) values. Here is a simple interface:

module GenericMap (MapParam : sig type 'a key  type ('a, 'b) value end) : sig
  type 'a key = 'a MapParam.key
  type ('a, 'b) value = ('a, 'b) MapParam.value
  type 'b t

  type 'b polyiter = { f : 'a. 'a key -> ('a, 'b) value -> unit }
  val iter : 'b polyiter -> 'b t -> unit
end

We will also need a GADT for this bit:

module GADT = struct
  type 'a t = Int : int -> int t
  let print (type a) (Int n : a t) = Format.printf "%d@." n
end

The problem arises when using non fully generic maps (i.e. when ('a, 'b) value = 'b).

module MapParam = struct
  type 'a key = 'a GADT.t
  type ('a, 'b) value = 'b
end

module GMap = GenericMap(MapParam)

Here, I was trying to instantiate 'b polyiter to print the map keys with GADT.print. This time I just couldn’t find an annotation the type-checker would accept:

(*  9 *) let polyiter : _ GMap.polyiter = { f = fun x _ -> GADT.print x }
(* 10 *) let polyiter : _ GMap.polyiter =
             { f = fun (type a) (x : a GADT.t) (_ : (a, 'b) GMap.value) -> GADT.print x }
(* 11 *) let polyiter : _ GMap.polyiter =
             let print : type a b. a GMap.key -> (a, b) GMap.value -> unit =
                fun k _v -> GADT.print k
             in { f = print }

All fail with error message:

File "test.ml", line 155, characters 35-58:
155 | let iter : _ GMap.polyiter = { f = fun x _ -> GADT.print x }
                                         ^^^^^^^^^^^^^^^^^^^^^^^
Error: This field value has type 'b GMap.key -> 'c -> unit
       which is less general than 'a. 'a GMap.key -> 'd -> unit

This looks fairly similar to the message we got in problem 1, and shares the same concerns:
Why can’t we unify 'b GMap.key -> 'c -> unit with 'a. 'a GMap.key -> 'd -> unit?

Fortunately, the opaque type strategy still works here:

module MapParamOpaque : sig
  type 'a key
  type ('a, 'b) value
  val convert_key : 'a key -> 'a GADT.t
  val convert_value : ('a, 'b) value -> 'b
end = struct
  type 'a key = 'a GADT.t
  type ('a, 'b) value = 'b
  let convert_key k = k
  let convert_value v = v
end

module GMapOpaque = GenericMap(MapParamOpaque)

(* 12 *) let iter : _ GMapOpaque.polyiter =
             { f = fun x _ -> GADT.print (MapParamOpaque.convert_key x) }

Is there a way to correctly annotate this that I missed ? I’d really like to avoid
making types opaque and having conversion functions all over the place.

Problem 3: the same thing, but with object methods [unsolved]

Another thing I tried was using object methods instead of record fields to pass polymorphic functions to iter. I’ve hardly ever used objects in OCaml, so I’m a bit out of my depth here. It seems they don’t work with opaque types, but they do with annotations.

module GenericSet (GenericElt : sig type 'a t end) : sig
  ...

  val obj_iter : < f : 'a. 'a elt -> unit > -> t -> unit
end

To define Set.obj_iter, the annotated version works:

(* 13 *) let obj_iter (f : Elt.t -> unit) set = GSet.obj_iter (object method f x = f x end) set

However, the non annotated version and opaque types fail:

(* 14 *) let obj_iter f set = GSet.obj_iter (object method f x = f x end) set

(* 15 *) let obj_iter f set = GSetOpaque.obj_iter
            (object
              method f : type a. a GSetOpaque.t -> unit =
                fun x -> f (GEltOpaque.convert x)
            end) set

(* 16 *) let obj_iter (f : Elt.t -> unit) set = GSetOpaque.obj_iter
              (object
                method f =
                  fun x -> f (GEltOpaque.convert x)
              end) set

Its worth mentioning because this time, the error message is different:

File "test.ml", line 116, characters 37-66:
116 |   let obj_iter f set = GSet.obj_iter (object method f x = f x end) set
                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error: This expression has type < f : 'b -> 'c >
       but an expression was expected of type < f : 'a. Elt.t -> unit >
       The method f has type 'b -> 'c, but the expected method type was
       'a. Elt.t -> unit
       The universal variable 'a would escape its scope

Once again I have trouble understanding this error message. How is our variable escaping its scope here? It doesn’t show up anywhere except the quantifier…

Methods and iterating generic maps

As I suspected, since I could only solve problem 2 with opaque types, and they don’t work when using methods, I can’t solve it. I have tried the following:

(* 17 *) let printer map = GMap.obj_iter (object method f k v = print k end) map

(* 18 *) let printer (type b) (map : b GMap.t) = GMap.obj_iter
            (object
              method f (type a) (k : a GMap.key) (v : (a,b) GMap.value) = print k
            end) map

(* 19 *) let printer (type b) (map : b GMap.t) =
            let f : type a. a GMap.key -> (a,b) GMap.value -> unit = fun k _ -> GADT.print k in
            GMap.obj_iter (object method f = f end) map

I get a similar message about 'a escaping its scope (even though this time, it does appear in the signature). Except for 19, which produces an error message at the let f similar to those of problem 2.

Questions

In summary, my main questions are:

  • Why can’t Elt.t -> unit and 'a. Elt.t -> unit be unified? Or rather why are they sometimes unified but sometimes not?
  • Why is implementation 6 rejected?
  • Can problem 2 be solved without opaque types?
  • Why is the type-checker complaining about a variable escaping its scope when that variable doesn’t show up in the type?
  • Can problem 2 be solved with object methods (end of problem 3)?
  • Also is the syntax error on let _ : type a. ... = ... expected ?
Full example file

Apparently I can’t just upload an ml file. So here are its contents:

(** * Prerequisites *)

(** Set of ['a Elt.t] elements, usually ['a Elt.t] will be a GADT to constrain ['a] *)
module GenericSet (GenericElt : sig type 'a t end) : sig
  type 'a elt = 'a GenericElt.t
  type t

  type polyiter = { f : 'a. 'a elt -> unit }
  val iter : polyiter -> t -> unit

  val obj_iter : < f : 'a. 'a elt -> unit > -> t -> unit
  (** Same as iter, but uses objects instead of record for the argument function *)
end = struct
  (* Example implementation, the details are unimportant.
     Any module with this signatur will work *)
  type 'a elt = 'a GenericElt.t
  type any_elt = Any : 'a elt -> any_elt
  type t = any_elt list

  type polyiter = { f : 'a. 'a elt -> unit }
  let rec iter f = function
    | [] -> ()
    | Any x :: set ->
        f.f x;
        iter f set

  let rec obj_iter (f : < f : 'a. 'a elt -> unit >) = function
    | [] -> ()
    | Any x :: set ->
        f#f x;
        obj_iter f set
end

(** Map with keys in ['a MapParam.key] and values in [('a, 'b) MapParam.values] *)
module GenericMap (MapParam : sig type 'a key type ('a, 'b) value end) : sig
  type 'a key = 'a MapParam.key
  type ('a, 'b) value = ('a, 'b) MapParam.value
  type 'b t

  type 'b polyiter = { f : 'a. 'a key -> ('a, 'b) value -> unit }
  val iter : 'b polyiter -> 'b t -> unit

  val obj_iter : < f : 'a. 'a key -> ('a, 'b) value -> unit > -> 'b t -> unit
  (** Same as iter, but uses objects instead of record for the argument function *)
end = struct
  (* Example implementation, the details are unimportant.
     Any module with this signatur will work *)
  type 'a key = 'a MapParam.key
  type ('a, 'b) value = ('a, 'b) MapParam.value
  type 'b any_binding = Any : 'a key * ('a, 'b) value -> 'b any_binding
  type 'b t = 'b any_binding list

  type 'b polyiter = { f : 'a. 'a key -> ('a, 'b) value -> unit }
  let rec iter f = function
    | [] -> ()
    | Any (k,v) :: map ->
        f.f k v;
        iter f map

  let rec obj_iter (f : < f : 'a. 'a key -> ('a, 'b) value -> unit > ) = function
  | [] -> ()
  | Any (k,v) :: set ->
      f#f k v;
      obj_iter f set
end

(** A basic GADT *)
module GADT = struct
  type 'a t = Int : int -> int t
  let print (type a) (Int n : a t) = Format.printf "%d@." n
end


(** * Problem 1 *)

(** Non-generic set built using [GenericSet] with type ['a GenericElt.t = Elt.t],
    uses a simpler interface (no need for function records in this case) *)
module Set (Elt : sig type t end) =
struct
  module GElt = struct type 'a t = Elt.t end

  module GSet = GenericSet (GElt)
  type t = GSet.t

  (* 1 *) let iter f set = GSet.iter { f = fun (elt : Elt.t) -> f elt } set

  (* 2 *) let iter (f : Elt.t -> unit) set = GSet.iter { f } set

  (* 3 *) let iter f set =
              let f : Elt.t -> unit = f in
              GSet.iter { f } set

  (* 4 *) let iter f set =
              let f : type a. Elt.t -> unit = f in
              GSet.iter { f } set

  (* Invalid annotations. These raises the following error:
    > File "test.ml", line 39, characters 29-30:
    > 39 |   let iter f x = GSet.iter { f } x
    >                                   ^
    > Error: This field value has type Elt.t -> unit which is less general than
    >         'a. Elt.t -> unit
    This is somewhat expected: the typechecker can't infer the generic type here *)
  (* 5 *) let iter f set = GSet.iter { f } set

  (* 6 *) let iter f set =
              let f : 'a. Elt.t -> unit = f in
              GSet.iter { f } set

  (* 7 *) let iter f set = GSet.iter { f = fun (type _a) elt -> f elt } set


  (** Opaque types work, even without annotations *)
  module GEltOpaque : sig
    type 'a t
    val convert : 'a t -> Elt.t
  end = struct
    type 'a t = Elt.t
    let convert x = x
  end

  module GSetOpaque = GenericSet (GEltOpaque)

  (* 8 *) let iter f set = GSetOpaque.iter { f = fun x -> f (GEltOpaque.convert x) } set

  (* The object method variant only works with annotations though *)
  (* 13 *) let obj_iter (f : Elt.t -> unit) set = GSet.obj_iter (object method f x = f x end) set

  (* Without annotations, we have a new error
     > File "test.ml", line 116, characters 37-66:
     > 116 |   let obj_iter f set = GSet.obj_iter (object method f x = f x end) set
     >                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     > Error: This expression has type < f : 'b -> 'c >
     >        but an expression was expected of type < f : 'a. Elt.t -> unit >
     >        The method f has type 'b -> 'c, but the expected method type was
     >        'a. Elt.t -> unit
     >        The universal variable 'a would escape its scope *)
  (* 14 *) let obj_iter f set = GSet.obj_iter (object method f x = f x end) set

  (* Even annotated, opaque types raise this error *)
  (* 15 *) let obj_iter f set = GSetOpaque.obj_iter
    (object
      method f : type a. a GSetOpaque.t -> unit =
        fun x -> f (GEltOpaque.convert x)
    end) set

  (* 16 *) let obj_iter (f : Elt.t -> unit) set = GSetOpaque.obj_iter
    (object
      method f =
        fun x -> f (GEltOpaque.convert x)
    end) set

  (* I commented these out since syntax errors break the LSP

  (* Also this is a syntax error, which is a bit surprising.
     Its the same code as 5 but with "_f" replaced by "_".
     I was under the impression you could use an underscore as a stand-in for
     any unused variable *)
  let iter f set =
    let _ : a. Elt.t -> unit = f in
    GSet.iter { f = fun (type _a) elt -> f elt } set

  (* The "type" annotation on also results in an error *)
  let iter f set =
    let _ : type a. Elt.t -> unit = f in
    GSet.iter { f = fun (type _a) elt -> f elt } set
  *)
end

(** * Problem 2 *)

module MapParam = struct
  type 'a key = 'a GADT.t
  type ('a, 'b) value = 'b
end

module GMap = GenericMap(MapParam)

(* I can't find a way to annotate this properly *)
(*  9 *) let polyiter : _ GMap.polyiter = { f = fun x _ -> GADT.print x }
(* 10 *) let polyiter : _ GMap.polyiter =
                { f = fun (type a) (x : a GADT.t) (_ : (a, 'b) GMap.value) -> GADT.print x }
(* 11 *) let polyiter : _ GMap.polyiter =
            let print : type a b. a GMap.key -> (a, b) GMap.value -> unit =
              fun k _v -> GADT.print k
            in { f = print }

(* But the opaque trick still works though *)
module MapParamOpaque : sig
  type 'a key
  type ('a, 'b) value
  val convert_key : 'a key -> 'a GADT.t
  val convert_value : ('a, 'b) value -> 'b
end = struct
  type 'a key = 'a GADT.t
  type ('a, 'b) value = 'b
  let convert_key k = k
  let convert_value v = v
end

module GMapOpaque = GenericMap(MapParamOpaque)

(* 12 *) let iter : _ GMapOpaque.polyiter =
              { f = fun x _ -> GADT.print (MapParamOpaque.convert_key x) }

(** ** Problem 3 *)

(* I can't find a method that work with object methods here *)
(* 17 *) let printer map = GMap.obj_iter (object method f k v = print k end) map

(* 18 *) let printer (type b) (map : b GMap.t) = GMap.obj_iter
            (object
              method f (type a) (k : a GMap.key) (v : (a,b) GMap.value) = print k
            end) map

(* 19 *) let printer (type b) (map : b GMap.t) =
            let f : type a. a GMap.key -> (a,b) GMap.value -> unit = fun k _ -> GADT.print k in
            GMap.obj_iter (object method f = f end) map
4 Likes

There is definitively something interesting going on, but note that the root of the problem is when defining:

module Mk(X: sig type 'x t end) = struct
  type 'a alias = 'a X.t
  type any = { f: 'x. 'x alias }
end
module X = struct
  type 'x t  = int
end
module M = Mk(X)
open M
let ok (f: int ) = { f }
let fail f = { f }

The typechecker fails to recognize that for any type x, x M.alias is always equivalent to int:

Error: This field value has type 'a M.alias which is less general than
        'x. 'x M.alias

(I have not seen the other error message that you mentioned.)
Typically, your successful annotations force to typechecker to expand the alias abbreviation in time.

And the issue is quite brittle: inlining the functor fixes the issue, removing the alias type fixes it too.
Thus certainly, a question worth investigating.

Concerning, the map note, that there is possibly lighter workaround:


  module MapParam = struct
    type 'a key = 'a GADT.t
    type ('a, 'b) value = V of 'a [@@unboxed]
  end

  module GMap = GenericMap(MapParam)
  let polyiter = { GMap.f = (fun x _ -> GADT.print x) }

which has the advantage of making the conversion function a no-op.

3 Likes

Super weird! Thanks for sharing :slight_smile:

For problem 2, the typechecker appears to be confused by type ('a, 'b) value = 'b and I get weird unifications looking like ('a, ('a, _) value) value which seems to imply that the 'b was accidentally made dependent on 'a (?). Building on your opaque type solution, it indeed works if only the type ('a, 'b) value is abstract (but no need for 'a key to be opaque as suggested by @octachron).

Somehow, it looks like we can help the typechecker see the lack of dependency by defining:

type ('a, 'b) kv = Kv : 'a key * ('a, 'b) value -> ('a, 'b) kv

type 'b polyiter = { f : 'a. ('a, 'b) kv -> unit }

… But it’s probably super brittle, since adding the following type annotation also fixes the issue:

module GMap
: module type of GenericMap (MapParam)
  with type ('a, 'b) value = ('a, 'b) MapParam.value
= GenericMap (MapParam)

let polyiter = { GMap.f = fun k _ -> GADT.print k }

(erh what?!)

A more minimal example for problem 2 too:

module Alias(X: sig type ('a, 'b) snd end) = struct
  type ('a, 'b) snd = ('a, 'b) X.snd
 end

 module M = struct
  type ('a, 'b) snd = 'b
 end

 module A = Alias(M)
 open A
 type 'b f = { f : 'a.  ('a, 'b) snd -> unit  }

 let conv (g: int -> unit ) = { f = g }

Error: This field value has type ('b, ('c, int) A.snd) A.snd -> unit
      which is less general than 'a. ('a, 'd) A.snd -> unit

(and the error message changes with -principal which is a rarely a good sign).

Thanks for these insights! Using type ('a, 'b) value = V of 'a [@@unboxed] for problem 2 is the best workaround I’ve seen so I’ll be using it.

Building on the minimal example: you can have the third error message when adding methods

module Mk(X: sig type 'a t end) = struct type 'a alias = 'a X.t end
module X = struct type 'a t = int end
module M = Mk(X)
open M

type any = { f: 'a. 'a alias }
let ok ( f: int ) = { f }
let fail f = { f }

type any_obj = < f: 'a. 'a alias >
let ok' ( f: int ) : any_obj = object method f = f end
let fail' f : any_obj = object method f = f end (* HERE *)

module Mk'(X': sig type ('a, 'b) snd end) = struct type ('a, 'b) snd = ('a, 'b) X'.snd end
module X' = struct type ('a, 'b) snd = 'b end
module M' = Mk'(X')
open M'

type 'b any' = { f : 'a. ('a, 'b) snd }
let fail2 (g: int) : int any' = { f = g }
let fail3 : int any' = { f = 2 }

type 'b any_obj' = < f: 'a. ('a, 'b) snd >
let ok'2 (g : int) : _ any_obj' = object method f = g end
let fail'2 g : int any_obj' = object method f = g end  (* HERE *)
let ok'3 : int any_obj' = object method f = 2 end
let fail'3 : _ any_obj' = object method f = 2 end  (* HERE *)

(* If the method also depends on 'a, even annotations don't help. *)
type 'b any_obj'2 = < f: 'a. 'a list -> ('a, 'b) snd >
let fail'4 (g : 'a list -> int) : int any_obj'2 = object method f = g end  (* HERE *)
let fail'5 : int any_obj'2 = object method f = List.length end  (* HERE *)

In the object case, the error is:

Error: This expression has type < f : 'b >
       but an expression was expected of type
         any_obj = < f : 'a. 'a M.alias >
       Type 'b is not compatible with type 'a M.alias = int
       The universal variable 'a would escape its scope

which is signaling that an ordinary type variable cannot be unified with the type scheme 'a. 'a M.alias (which is true for a non-bivariant M.alias) which is also an issue with no expanding 'a . 'a M.alias to discover that 'a. 'a M.alias is int.