Unable to refute impossible GADT pattern with polymorphic variants

I’ve been playing around with the idea of modeling FSMs with GADTs and polymorphic variant constraints.

The basic idea looks as follows:

type state = [ `foo | `bar ]
type ('from, 'to_) event =
  | First :
      ([ `initial ], [ `foo ]) event
  | Second : ([ `foo ], [ `bar ]) event
  | Third : ([ `bar ], [ `terminal ]) event
type _ step =
  | I : [ `initial ] step
  | S : 'a step * ('a, [< state ] as 'b) event -> 'b step
  | T : 'a step * ('a, [ `terminal ] as 'b) event -> 'b step

type t = M : _ step -> t

Here, state represents a possible state of the FSM, an event is what allows for transitions between states, and step is the glue that holds it all together.

Ideally, this looks something like:

let _ = M (T (S (S (I, First), Second), Third))

which is a transition from the initial state to the terminal one.

This works pretty well actually, except when I’ve been pattern matching over t. The compiler (rightly) does not allow something like this to be constructed

let _ = M (T (S (I, First), Third))

It gives the error

Error: This expression has type ([ `bar ], [ `terminal ]) event
but an expression was expected of type ([ `foo ], [ `terminal ]) event
These two variant types have no intersection

However, when pattern matching over values of t, the compiler refuses to refute such impossible cases:

let f = function
  | M (S (_, First)) -> ()
  | M (S (_, Second)) -> ()
  | M I -> ()
  | M (T (S (_, Second), Third)) -> ()
  | M (T (S (_, First), Third)) -> .

It gives the error

Error: This match case could not be refuted.
Here is an example of a value that would reach it:
M (T (S (I, First), Third))

Am I doing something wrong here or is this just a limitation of using polymorphic variant constraints with GADTs?

1 Like

I’ve boiled this down to a smaller set of types and have found a “fix”.

Here is a minimal reproduction of the above “bug”:

type _ b =
  | X : [ `x ] b
  | Y : [ `y ] b
type t =
  | T : 'a b * ([< `x | `y ] as 'a) -> t

let f = function 
  | T (X, `x) -> ()
  | T (Y, `y) -> ()
let _ = T (X, `y)

The compiler complains both about f being non-exhaustive and the last line is non-constructable.

The fix is simply to swap the order of the arguments to the constructor in t:

type t' =
  | T : ([< `x | `y ] as 'a) * 'a b -> t

My hypothesis is that this has to do with the order in which the types are inferred, which seems to be left-to-right. In type t, 'a is open in 'a b and then constrained to [< `x | `y ] in the second argument. If the inference goes left-to-right, that would mean any combination of 'a b and [ `x | `y ] is possible.

However in t', the first argument constrains 'a which narrows the possible 'a bs that can appear in the second position.

This is just a hypothesis, however. I’m not sure if this is actually what’s happening.

Matching on a GADT adds type equalities to the typing environment. Those equalities can be used in the right-hand-side of the pattern, and also “to the right” of the pattern where they were introduced. You are right that this is a left-to-right bias in the type-checker, but note that this is not related to type inference per se (guessing types), it is part of the typing rules that even a fully-explicit version of OCaml would respect. (It is of course possible to consider different typing rules that allow to type more programs.)


Ah thanks @gasche for the clarification – I was definitely misusing the phrase “inference” here to mean the introduction/reconciliation of type equalities.

it is part of the typing rules that even a fully-explicit version of OCaml would respect.

This makes sense to me which is why I held off on calling the behavior a “bug” in the type system. As someone who knows very little about formal type theory, would it (in theory) be possible to detect the possibility of something like this at the type definition itself and emit a warning? I recognize this sort of thing manifests itself pretty rarely but it was pretty confusing to me at first.

First, as a general remark, the interaction of row variables and GADTs is not well specified. Thus it is common to hit hard to understand behaviour, and nothing is guaranteed beyond the fact that the currently implemented interaction is safe.
The issue is that GADTs work by adding equations, and those equations interact poorly with polymorphic variant constraints. In particular, GADT equations cannot narrow a polymorphic variant constraints.
Thus, when you write

| T (X, _ ) -> 

the type system doesn’t know that it can narrow the equation x < [ 'x | `y ] to x = [ `x ]

Second, your last example doesn’t work? Switching the order of the type definition doesn’t change anything as far as I can see. Did you forget to switch the arguments of the constructor?

Fortunately, your initial idea can be implemented more naturally with object types (as type-level record):

(** Let define our type-level values first *)
type yes = [`yes]
type no = [ `no ]
type boolean = [ yes | no ]
type tags = [ `foo | `bar | `initial | `terminal ]

(** The abbreviation state is only useful to catch typos *)
type 'a state = < 
  tag: [< tags ] ; 
  initial: [< boolean ]; 
  terminal: [< boolean ]
> as 'a

type initial = < tag: [`initial]; initial:yes; terminal:no > state             
type foo = < tag: [`foo]; initial:no; terminal: no > state
type bar = < tag : [`bar]; initial:no; terminal: no > state
type terminal = <tag: [`terminal]; initial:yes; terminal:yes > state

type (_,_) event =
  | First: (initial, foo) event
  | Second: (foo, bar) event
  | Third: (bar, terminal) event

type step = 
| I: initial step
| S: 'a step * ('a,'b) event -> 'b step  
| T: 
  'a step * ('a, <terminal:yes; tags: 't; initial:'i > as 'b) event -> 
  'b step

Then we can check that the typechecker has no problem checking that we only have four possible paths

let f: type a. a step -> unit = function
  | I -> ()
  | S (I, First) -> ()
  | S (S(I,First), Second) -> ()
  | T (S(S(I,First), Second), Third) -> ()
  | S (S(S(I,First), Second), Third) -> 
     () (* Was that history supposed to be allowed ?*)
  | _ -> .