I came up with an example that demonstrates my understanding of the issue. The goal is to construct an example where always choosing either of the two possible solutions for the unification problem makes the program ill-typed (so we can see there is no most general solution).
First, let’s introduce a higher-kinded function. Since OCaml doesn’t have first-class higher-kinded polymorphism, we use a functor (the “heavyweight” higher-kinded polymorphism):
module Fold2 (T1 : sig type 'a t end) (T2 : sig type 'a t end) : sig
val f : (('a -> 'b -> 'b) -> 'c T1.t -> 'b -> 'b) ->
(('a -> 'b -> 'b) -> 'c T2.t -> 'b -> 'b) ->
('a -> 'a -> 'b -> 'b) ->
'c T1.t -> 'c T2.t -> 'b -> 'b
end = struct
let f f1 f2 f c1 c2 = f1 (fun e1 -> f2 (f e1) c2) c1
end
Basically it’s a fold over cross-product of the elements of two containers, but the type parameters of the container types ('c
), though are equal, don’t necessarily correspond to the element type (although are same in both containers).
Now let’s introduce some useful type that makes use of this relaxed constraint. We choose it to be a synonym that is seen as an abstract type from the outside of the module so that the parameter becomes phantom and expresses some additional invariant (constraint) on the state of the container (rather than expressing the type of the element). For example, we can encode emptiness of a set of integers so that
empty set + empty set = empty set
empty set + nonempty set = nonempty set
nonempty set + nonempty set = nonempty set
empty set x empty set = empty set
empty set x nonempty set = empty set
nonempty set x nonempty set = nonempty set
where x
is the Cartesian product of the sets:
type (+'s0, +'p0) empty =
<
name : [`empty];
plus_empty : ('s0, 'p0) empty;
plus_nonempty : ('s0, 'p0) nonempty;
times_empty : ('s0, 'p0) empty;
times_nonempty : ('s0, 'p0) empty
> * ('s0 * 's1) * ('p0 * 'p1)
constraint 's0 = < plus_empty : 's1; plus_nonempty : _; .. > * _ * _
constraint 'p0 = < times_empty : 'p1; times_nonempty: _; .. > * _ * _
and (+'s0, +'p0) nonempty =
<
name : [`nonempty];
plus_empty : ('s0, 'p0) nonempty;
plus_nonempty : ('s0, 'p0) nonempty;
times_empty : ('s0, 'p0) empty;
times_nonempty : ('s0, 'p0) nonempty;
> * ('s0 * 's1) * ('p0 * 'p1)
constraint 's0 = < plus_nonempty : 's1; plus_empty: _; .. > * _ * _
constraint 'p0 = < times_nonempty :'p1; times_empty: _; .. > * _ * _
module Int_set : sig
type +'a t
val empty : (_ empty) t
val single : int -> (_ nonempty) t
val choose : (_ nonempty) t -> int
val sum : 'a t -> (_ * ('a * 'b) * _) t -> 'b t
val xprod : (int -> int -> int) -> 'a t -> (_ * _ * ('a * 'b)) t -> 'b t
val intersects : 'a t -> 'b t -> bool
val fold : (int -> 'b -> 'b) -> 'a t -> 'b -> 'b
end = struct
module Int_set = Set.Make (struct type t = int let compare : int -> _ = compare let equal : int -> _ = (=) end)
open Int_set
type nonrec +'a t = t
let empty = empty
let single = singleton
let choose = choose
let sum = union
let fold = fold
module F = Fold2 (struct type 'a t = Int_set.t end) (struct type 'a t = Int_set.t end)
let xprod f a b = F.f fold fold (fun e1 e2 -> add @@ f e1 e2) a b empty
let intersects a b = F.f fold fold (fun e1 e2 acc -> acc || e1 = e2) a b false
end
Now let’s try to do the same for lists of integers, but extend the module with three extra “useful” (the example is still somewhat artificial) functions:
val intersects1 : 'a Int_set.t -> 'b t -> bool
val intersects2 : int list -> 'b t -> bool
val intersects3 : int list -> string list -> bool
where the type 'a t
is the type of integer lists with a phantom parameter indicating emptiness:
module Int_list : sig
type +'a t
val empty : (_ empty) t
val single : int -> (_ nonempty) t
val choose : (_ nonempty) t -> int
val sum : 'a t -> (_ * ('a * 'b) * _) t -> 'b t
val xprod : (int -> int -> int) -> 'a t -> (_ * _ * ('a * 'b)) t -> 'b t
val intersects : 'a t -> 'b t -> bool
val fold : (int -> 'b -> 'b) -> 'a t -> 'b -> 'b
val intersects1 : 'a Int_set.t -> 'b t -> bool
val intersects2 : int list -> 'b t -> bool
val intersects3 : int list -> string list -> bool
end = struct
open List
type +'a t = int list
let empty = []
let single a = [a]
let choose = hd
let sum = (@)
let fold = fold_right
module F = Fold2 (struct type 'a t = int list end) (struct type 'a t = int list end)
let xprod f a b = F.f fold fold (fun e1 e2 acc -> f e1 e2 :: acc) a b []
let f e1 e2 acc = acc || e1 = e2
let intersects a b = F.f fold fold f a b false
module F' = Fold2 (struct type 'a t = 'a Int_set.t end) (struct type 'a t = int list end)
module F'' = Fold2 (struct type 'a t = int list end) (struct type 'a t = 'a list end)
let intersects1 : type a b. a Int_set.t -> b t -> bool = fun a b -> F'.f Int_set.fold fold f a b false
let intersects2 a b = F''.f fold fold f a b false
let intersects3 a b = F''.f fold (fun f -> fold @@ fun x -> f @@ int_of_string x) f a b false
end
Here, at last, we can see the problem. Since we used the OCaml’s “heavyweight” functors, we were able to precisely indicate how to perform the higher-order unification in presence of type synonyms (the functor applications resulting in modules F'
and F''
). If we try to change either of the exactly specified solutions (try to exchange int list
<–> 'a list
in the second parameter of the functor applications), we get a type error. You can just try it to see exactly which those errors are in both cases. So this example just tangibly demonstrates that neither of the solutions is more general than the other one.
Now I intentionally used rigid type variables in the definition of intersects1
to expose the problem with Haskell approach as I understand it (rigid type variables allow to simulate Haskell approach to specifying type signatures and avoid possible signature inclusion check failure). Basically, Haskell doesn’t treat type synonyms as first-class types and always expands them until it obtains applications of type constructors (datatypes). Then there’s no ambiguity, but just the second solution is always chosen by definition (no first-class support for type synonyms). So we can expect the same error as if we replace type 'a t = int list
with type 'a t = 'a list
in the second argument of Fold2
for module F'
. I tried to simulate this situation with GHCi (I have less experience with Haskell, so no real implementation of set with a phantom type, just a dummy datatype definition with a phantom parameter) and indeed obtained the following error:
:set -XRankNTypes
:{
let f :: forall t1 t2 a b c. ((a -> b -> b) -> t1 c -> b -> b)
-> ((a -> b -> b) -> t2 c -> b -> b)
-> (a -> a -> b -> b)
-> t1 c -> t2 c -> b -> b
f f1 f2 f c1 c2 = f1 (\e1 -> f2 (f e1) c2) c1
:}
data IntSet a = IntSet [Int]
type T a = [Int]
:{
let foldr :: (Int -> b -> b) -> T a -> b -> b
foldr k = go
where
go [] z = z
go (y:ys) z = y `k` go ys z
:}
:{
let foldrIntSet :: (Int -> b -> b) -> IntSet a -> b -> b
foldrIntSet f (IntSet l) = foldr f l
:}
:{
let intersect :: T a -> T b -> Bool
intersect a b = f foldr foldr (\e1 e2 -> (|| e1 == e2)) a b False
:}
:{
let intersect1 :: IntSet a -> T b -> Bool
intersect1 a b = f foldrIntSet foldr (\e1 e2 -> (|| e1 == e2)) a b False
:}
<interactive>:31:68: error:
• Couldn't match type ‘a’ with ‘Int’
‘a’ is a rigid type variable bound by
the type signature for:
intersect1 :: forall a b. IntSet a -> T b -> Bool
at <interactive>:30:5-41
Expected type: IntSet Int
Actual type: IntSet a
• In the fourth argument of ‘f’, namely ‘a’
In the expression:
f foldrIntSet foldr (\ e1 e2 -> (|| e1 == e2)) a b False
In an equation for ‘intersect1’:
intersect1 a b
= f foldrIntSet foldr (\ e1 e2 -> (|| e1 == e2)) a b False
• Relevant bindings include
a :: IntSet a (bound at <interactive>:31:16)
intersect1 :: IntSet a -> T b -> Bool (bound at <interactive>:31:5)
It looks very similar to the OCaml analogue (when we use rigid type variables and choose the wrong solution in F'
):
This expression has type a Int_set.t
but an expression was expected of type int Int_set.t
Type a is not compatible with type int
Edit: typos, better Haskell formatting