Do I need polymorphic variants?


#1

Hi, I’m learning OCaml and I wanted to model some sort of finite state machine with static checking of possible state transitions.
I started with variants, trying to code it like this:

type stopwatch =
  | Running of int
  | Paused of int

let start = Running 0
let tick (Running s) = Running (s + 1)
let pause (Running s) = Paused s
let resume (Paused s) = Running s

let sw = start |> tick |> pause |> tick

I expect that last line will fail with compiler error because you shouldn’t be able to tick on paused stopwatch. But it compiles without errors.

Then I discovered polymorphic variants and used it instead:

type stopwatch = [
  | `Running of int
  | `Paused of int
]

let start = `Running 0
let tick (`Running s) = `Running (s + 1)
let pause (`Running s) = `Paused s
let resume (`Paused s) = `Running s

let sw = start |> tick |> pause |> tick

This time compiler works as I expected:

Error: This expression has type [< `Running of int ] -> [> `Running of int ]
       but an expression was expected of type [> `Paused of int ] -> 'a
       The first variant type does not allow tag(s) `Paused

Is it ok to use polymorphic variants here or do I need to do something more with the first example?


#2

Polymorphic variants look like a very reasonable choice. You could also use GADTs, although they are overkill for this simple problem.

Plain data types provide no way to refine a type to indicate “this case can’t happen”, so if you want more precise type checking you have to pick one of those two.


#3

You shall use polymorphic variants when you need to express an inclusion relation between one set of variants and another set of variants, i.e., when you need subtyping. Or you can use them if you’re lazy and don’t want to prefix constructor names with module names :slight_smile:

In your case, you do not need to use them. Nor do you need normal variants. In fact, your usage of polymorphic variants highlights this, as ['Running of int ] and ['Paused of int ] are treated as different types, not members of the same type.

Indeed, if you do not want Running x and Paused x to be unified and treated as values of the same type, then just don’t make them so, e.g.,

type running = Running of int
type paused  = Paused of int

let start = Running 0
let tick (Running s) = Running (s + 1)
let pause (Running s) = Paused s
let resume (Paused s) = Running s

let sw = start |> tick |> pause |> tick

The construction type x = Make of int defines a new type, and a constructor Make : int -> x that creates values of the new type x from the values of type int. Not to confuse with the type x = int that introduces a type abbreviation, but since abbreviation could be longer the original type, it is better to call it “alias”.

Another way to introduce a new type in OCaml is to use records:

type running = {running : int}
type paused  = {paused : int}

let start = {running=0}
let tick {running=s} = {running = s + 1}
let pause {running=s} = {paused=s}
let resume {paused=s} = {running=s}

let sw = start |> tick |> pause |> tick

P.S. Also, it’s worthwhile to mention, that your original OCaml source code is not truly legit, as you were using irrefutable pattern matching, i.e., patterns that may fail at runtime with the Match_failure exception. The compiler and toplevel will emit a warning about this, and most of the OCaml programmers treat this warning as an error.


#4

As implied by @ivg, it really depends on the set of operations you have to
implement and whether there is some kind of overlapping between the states.

For instance, your encoding does not seem to work if you have an operation
that preserves the running/paused mode but resets the counter. You would
probably write:

let reset = function
  | `Running s -> `Running 0
  | `Paused s -> `Paused 0

but then, the following “legal” chain of transitions would be rejected:

let sw = start |> tick |> pause |> reset |> resume |> tick

with:

Error: This expression has type [< `Paused of 'a ] -> [> `Running of 'a ]
       but an expression was expected of type
         [> `Paused of int | `Running of int ] -> 'b
       The first variant type does not allow tag(s) `Running

because the type of reset is:

[< `Paused of 'a | `Running of 'b ] -> [> `Paused of int | `Running of int ]

which means that you “lose” the running/paused mode when going through
this function.

If you have to implement such an operation, you can switch to GADTs, as
suggested by @gsg:

type running
type paused
type _ stopwatch =
  | Running : int -> running stopwatch
  | Paused  : int -> paused  stopwatch
let start = Running 0
let tick = function
  | Running s -> Running (s + 1)
let pause = function
  | Running s -> Paused s
let resume = function
  | Paused s -> Running s
let reset : type a . a stopwatch -> a stopwatch = function
  | Running s -> Running 0
  | Paused s -> Paused 0

(Notice that you could provide the same interface with no GADTs, at the
expense of “unsafe” function implementations.)


#5

This is a great example of using polymorphic variants for doing state transitions in a type-safe way http://camltastic.blogspot.ca/2008/05/phantom-types.html


#6

Indeed, but if you use phantom types + polymorphic variants and want to provide
an operation such as reset, your implementation will be “unsafe”:

module P : sig
  type 'a stopwatch
  val start : [`Running] stopwatch
  val tick : [`Running] stopwatch -> [`Running] stopwatch
  val pause : [`Running] stopwatch -> [`Paused] stopwatch
  val resume : [`Paused] stopwatch -> [`Running] stopwatch
  val reset : 'a stopwatch -> 'a stopwatch
end = struct
  type 'a stopwatch =
    | Running of int
    | Paused  of int
  let start = Running 0
  let tick = function
    | Running s -> Running (s + 1)
    | _ -> assert false
  let pause = function
    | Running s -> Paused s
    | _ -> assert false
  let resume = function
    | Paused s -> Running s
    | _ -> assert false
  let reset : type a . a stopwatch -> a stopwatch = function
    | Running s -> Running 0
    | Paused s -> Paused 0
end

where the assert false are really impossible cases given the interface.
With GADTs, you do not need to write such cases.


#7

Finally, you can combine both schools of magic and use polymorphic types as phantoms annotating GADTS, e.g.,


#8

Using many features (objects, polymorphic variants, recursive types, GADT) and some known tricks I ended up with a way to encode deterministic finite state machine with heterogenously typed states (so that only allowed transitions are accepted by the type checker), but it’s a bit heavy (just an example for 3 states, 3 operations and 7 allowed transitions):

type state0 =
  < trans0 : state1;
    trans1 : state2;
    trans2 : [`forbidden];
    id     : [`_0] >
and state1 =
  < trans0 : state2;
    trans1 : state0;
    trans2 : state2;
    id     : [`_1] >
and state2 =
  < trans0 : state0;
    trans1 : [`forbidden];
    trans2 : state2;
    id     : [`_2] >

type data0 =
  {
    field0 : int;
    field1 : string
  }

type data1 = A | B of int

type data2 = int list option

type ('state, 'data) estate =
  | State0 : (state0, data0) estate
  | State1 : (state1, data1) estate
  | State2 : (state2, data2) estate

type 'state fsm_state = Ex : ('state, 'data) estate * 'data -> 'state fsm_state

let trans0 : type s0 s1 s2 id. < trans0 : s0; trans1 : s1; trans2 : s2; id : id > fsm_state -> s0 fsm_state =
  function
  | Ex (State0, { field0; field1 }) -> Ex (State1, if field0 < 0 && field1 = "" then A else B field0)
  | Ex (State1, A)                  -> Ex (State2, None)
  | Ex (State1 ,B i)                -> Ex (State2, Some [i])
  | Ex (State2, Some (x :: xs))     -> Ex (State0, { field0 = x; field1 = "" })
  | Ex (State2, (None | Some []))   -> Ex (State0, { field0 = 0; field1 = "" })

let trans1 :
  type s0 s10 s11 s12 s1i s2 id.
  < trans0 : s0; 
    trans1 : < trans0 : s10; trans1 : s11; trans2 : s12; id : s1i > as 'u;
    trans2 : s2;
    id : id > fsm_state -> 'u fsm_state =
  function
  | Ex (State0, { field0; field1 }) -> Ex (State2, if field0 < 0 && field1 = "" then Some [field0] else None)
  | Ex (State1, A)                  -> Ex (State0, { field0 = 0; field1 = "" })
  | Ex (State1 ,B i)                -> Ex (State0, { field0 = i; field1 = string_of_int i})

let rec trans2 :
  type s0 s1 s20 s21 s22 s2i id.
  < trans0 : s0;
    trans1 : s1;
    trans2 : < trans0 : s20; trans1 : s21; trans2 : s22; id : s2i > as 'u;
    id : id > fsm_state -> 'u fsm_state =
  function
  | Ex (State1, A)                  -> trans2 @@ Ex (State1, B 0)
  | Ex (State1, B i)                -> Ex (State2, None)
  | Ex (State2, None)               -> Ex (State2, Some [])
  | Ex (State2, Some [])            -> Ex (State2, None)
  | Ex (State2, Some (x :: xs))     -> trans2 @@ Ex (State2, Some xs)

let r0 = Ex (State0, { field0 = 1; field1 = ""}) |> trans1 |> trans0
let r1 = Ex (State0, { field0 = 1; field1 = ""}) |> trans1 |> trans0 |> trans0
let r2 = Ex (State0, { field0 = 1; field1 = ""}) |> trans1 |> trans0 |> trans0 |> trans2

(* let re = Ex (State0, { field0 = 1; field1 = ""}) |> trans1 |> trans1 *) (* Type error *)


#9

The problem with polymorphic variants is that your code will start to be hard to maintain.
Better list and name all the types you need right now.
Only if used very locally have I seen then useful.
For example:

val BatList.range int -> [< `Downto | `To ] -> int -> int list                
    range 1 `To 3 = [1; 2; 3]
    range 3 `Downto 1 = [3; 2; 1]

#10

Why not just

module P : sig
  type 'a stopwatch
  val start : [`Running] stopwatch
  val tick : [`Running] stopwatch -> [`Running] stopwatch
  val pause : [`Running] stopwatch -> [`Paused] stopwatch
  val resume : [`Paused] stopwatch -> [`Running] stopwatch
  val reset : 'a stopwatch -> [`Running] stopwatch
end = struct
  type 'a stopwatch = int
  let start = 0
  let tick = succ
  let pause s = s
  let resume = pause
  let reset _ = start
end

?

The module signature ensures you can’t do, e.g., start |> pause |> tick


#11

A potential problem is that you cannot inspect the running/stop mode
at running. Another one, common to the use of phantom types without
GADTs is that the compiler will check almost nothing regarding your
implementation. The GADT version(s), in contrast, check that e.g.
pause only accepts running watches and always returns stopped ones.

The original poster might be interested to have a look at the GADT+phantoms
implementation suggested by @ivg; here it is:

module GADT = struct
  type _ stopwatch =
    | Running : int -> [`Running] stopwatch
    | Paused  : int -> [`Paused]  stopwatch
  let start = Running 0
  let tick = function
    | Running s -> Running (s + 1)
  let pause = function
    | Running s -> Paused s
  let resume = function
    | Paused s -> Running s
  let reset : type a . a stopwatch -> a stopwatch = function
    | Running s -> Running 0
    | Paused s -> Paused 0
  let sw = start |> tick |> pause |> reset |> resume |> tick
end

#12

Thx for the all responses. I have few new things to learn now.
I really like GADTs examples by @xclerc and I’ll try to use it in my code.


#13

This is nice, though I am bit not clear on how one can use this GADT type with other stdlib data structures. For example I tried to do the following,
let l = [Running 0;Paused 10];;
But got an error,

Error: This expression has type paused stopwatch
       but an expression was expected of type running stopwatch
       Type paused is not compatible with type running

Could we somehow perhaps make the stdlib list more generalized? Is there such a notation/mechanism?


#14

Rather than generalize lists, it may be more interesting to introduce another type for a stopwatch in any state, and then create lists of that type:

type wrapper = Stopwatch : 'a stopwatch -> wrapper

let l = [ Stopwatch (Running 0); Stopwatch (Paused 10) ];