Prohibiting invalid states with GADT - the airlock exercise

Just a fun example I read about. Imagine you have an airlock with two doors, one inner and one outer. Of course, the state to have both doors open is not allowed. How would you represent this with GADT, in a way that you can easily query the state of a single door, like get_inner_door_state st?

Next challenge, you also add air pressure to the airlock, which can be either pressure or no pressure. No pressure is only allowed when the outer door is open.

The interesting topic is how many changes are needed in the code when adding a new state.

https://dev.realworldocaml.org/gadts.html

2 Likes

I didn’t try with GADTs but I did make an attempt with phantom types as they are useful for representing a state machine:

module Airlock : sig
  type outter
  type inner
  type closed
  type 'a t

  val create : unit -> closed t
  val closed : 'a t -> closed t
  val open_outter : closed t -> outter t
  val open_inner : closed t -> inner t
end = struct
  type outter
  type inner
  type closed
  type 'a t = unit

  let create () = ()
  let closed () = ()
  let open_outter () = ()
  let open_inner () = ()
end

let () =
  let airlock = Airlock.create () in
  let airlock = Airlock.open_outter airlock in
  let airlock = Airlock.open_inner airlock in
  ()

This doesn’t compile, open_inner requires a closed airlock. The downside to this is that in reality your airlock is the same “airlock” between operations and would probably be mutable, so creating a value of a new type doesn’t work.

1 Like

I realized this when lying in bed! That phantom types might be a better tool. Do you know how to wrap it in the state monad, so that you don’t have to pass around the airlock state explicitly? Using let* or such.

1 Like

Another version of same concept, but splitting each door into its own type variable (to more easily add the air pressure state later):

module Airlock : sig
  type opened
  type closed
  type ('inner, 'outer) t

  val create : unit -> (closed, closed) t
  val open_inner  : (closed, closed) t -> (opened, closed) t
  val close_inner : (opened, 'b)     t -> (closed, 'b) t
  val open_outer  : (closed, closed) t -> (opened, opened) t
  val close_outer : ('a, opened)     t -> ('a, closed) t
end = struct
  type opened
  type closed
  type ('inner, 'outer) t = unit

  let create () = ()
  let open_inner  t = t
  let close_inner t = t
  let open_outer  t = t
  let close_outer t = t
end

let () =
  let airlock = Airlock.create () in
  let airlock = Airlock.open_inner airlock in
  let airlock = Airlock.close_inner airlock in
  let airlock = Airlock.open_outer airlock in
  ()
1 Like

Same solution with GADTs:

type opened
type closed

type (_,_) state =
    | Create_state : unit -> (closed, closed) state
    | Open_inner : (closed, closed) state -> (opened, closed) state
    | Open_outer : (closed, closed) state -> (closed, opened) state

let rec eval_state : type a b. (a, b) state -> unit = function
    | Create_state () -> print_endline "Creating state"
    | Open_inner s ->
        print_endline "Opening inner door";
        eval_state s
    | Open_outer s ->
        print_endline "Opening outer door";
        eval_state s

let _ =
    let s = Open_outer (Create_state ()) in
    eval_state s
1 Like

Both solutions are having the same problem regarding scalability: You can’t add a new state variable, like an extra door or air pressure, without touching every function in the state machine. :frowning:

1 Like

Wait, how would I add

  val print_inner_door : ('a, 'b) t -> unit

to this solution, to print out the state of the inner door? I can’t match on the phantom type :stuck_out_tongue:

1 Like

The other problem, which perhaps doesn’t matter for the exercise, is if you’re trying to build an actual controller of a physical airlock, making a new value on every operation doesn’t work unless you can guarantee some uniqueness of the value. Some piece of code could always hold on to a previous value and then call open on it.

I think it’s an interesting type-level exercise, I just don’t want to see someone giving a talk about how the Ocaml type system means you can write more secure airlocks :slight_smile:

2 Likes

Yeah, you’d need some kind of affine types, or use-only-once, that kind of thing. Hiding the state in a monadic bind is another alternative, but of course it does not enforce correct usage.

2 Likes

Hm, anyone knows why this code doesn’t work?

module Airlock : sig
  type opened
  type closed
  type ('inner, 'outer) t

  val create : unit -> (closed, closed) t
  val print_inner_door : ('a, 'b) t -> unit
end = struct
  type opened = Opened
  type closed = Closed
  type ('inner, 'outer) t = State of 'inner * 'outer

  let create () = State (Closed, Closed)
  let print_inner_door : ('a, 'b) t -> unit = function
    | State (Closed, _) -> print_endline "Closed"
    | State (Opened, _) -> print_endline "Opened"
end

Error:

39 |     | State (Opened, _) -> print_endline "Opened"
                  ^^^^^^
Error: This variant pattern is expected to have type closed
       The constructor Opened does not belong to type closed

Also error when trying

  let print_inner_door : type a b. (a, b) t -> unit = function

fails with

38 |     | State (Closed, _) -> print_endline "Closed"
                  ^^^^^^
Error: This pattern matches values of type closed
       but a pattern was expected which matches values of type a

:thinking: Is this an inherit limit of polymorphism?

1 Like

In this version, you are not using GADTs but ordinary variants. And with ordinary variants, a polymorphic function of type 'a 'b. ('a,'b) t -> ... must work for any types 'a and 'b which is only possible if the function handles the inner values of State(x,y) as black boxes. Similarly, a function of type 'b. (closed, 'b) t -> ... requires that x is a value of type closed.

1 Like

Slight rewrite and it works:

module Airlock : sig
  type opened
  type closed
  type 'a door_state
  type ('inner, 'outer) t

  val create : unit -> (closed door_state, closed door_state) t
  val open_inner  : (closed door_state, closed door_state) t -> (opened door_state, closed door_state) t
  val print_inner : ('a door_state, 'b door_state) t -> unit
end = struct
  type opened = Opened_tag
  type closed = Closed_tag
  type 'a door_state =
      | Opened
      | Closed

  type ('inner, 'outer) t = State of 'inner * 'outer

  let create () = State (Closed, Closed)
  let open_inner t = State (Opened, Closed)
  let print_inner : ('a door_state, 'b door_state) t -> unit = function
    | State (Opened, _) -> print_endline "Opened"
    | State (Closed, _) -> print_endline "Closed"
end

let () =
  let open Airlock in
  create ()  |>
  open_inner |>
  print_inner

Opened

Bit awkward code, but you do what you gotta do. :slight_smile:

Would someone like to show how to replace the pipe here with let*? Would that make sense?

What would it mean to map or bind on an airlock? Would it be an object inside the airlock that is mapped? I would expect map to only work in a certain configuration of doors - but then the interface 'a t -> ('a -> 'b) -> 'b t isn’t fulfilled, as that allows one to always map.

Gasche was kind enough to update his 9 year old answer to my old question; see here: Linear types in OCaml - Stack Overflow

The question is using file handler with open/close state, but same principle applies to the airlock. :slight_smile:

Excerpt:

(* starting with OCaml 4.13, you can use binding operators:
   ( let* ) instead of ( >>= ) *)
let test =
  let open File in
  let ( let* ) = bind in
  run begin
    let* () = open_ "/tmp/foo" in
    let* content = read in
    print_endline ("[user] read " ^ content);
    close
  end

Cool. Concerning my earlier comment - I didn’t know that monadic bind was allowed to have an extra type-parameter? (i.e. I thought bind should always have type 'a M.t -> ('a -> 'b M.t) -> 'b M.t) If the type of the second type-parameter changes, is it then another monad?

In the specific example @gasche wrote, it’s possible to open_ several times, and just call close once (I know he writes it’s not fleshed out).

Playing with how the interface could be better, it seems it could be nice to:

  • represent each of the opened files as their own type-parameter
  • only allow run when all type-parameters are close_st
  • have several open_<N> procedures to open each of the possible type-parameters
  • not allow open_1 if that type-param is already opened
  • to close_<N> the correct fd<N> - fd<N> would need to bear the type-info too.

Usage would look something like:

let test =
  let open File in
  let ( let* ) = bind in
  run begin
    let* fd1 = open_1 "/tmp/foo" in
    let* fd2 = open_2 "/tmp/bar" in
    let* fd3 = open_3 "/tmp/baz" in
    let* content1 = read fd1 in
    ...
    let* () = close_1 fd1 in
    let* fd1 = open_1 "/tmp/another_after_close" in
    ...
    let* () = close_1 fd1 in
    let* () = close_2 fd2 in
    close_3 fd3
  end
1 Like

Is it? I guess it depends on the type you consider as the return type of the run function. But if you restrict yourselves to examples that return a base type, for example unit or int (in particular, not st itself), then I think (but I may miss something) that the types force you to close everything you open.

I did something about that a long time ago: sage, an agnostic monad to handle file which gives you the opportunity to prove that you really closed a file. I did not release it but I suspect that the library pretty good (but I was not sure that it was so useful).

1 Like

Hmm, I don’t know how you then intended the interface to be used, but this is allowed:

let test =
  let open File in
  let ( let* ) = bind in
  run begin
    let* () = open_ "/tmp/foo" in
    let* content = read in
    print_endline ("[user] read " ^ content);
    let* () = open_ "/tmp/bar" in
    let* () = open_ "/tmp/baz" in
    close
  end

Nice, but now I have questions (:

  • Why is the Sage.Refl thing needed? (I mean - why runtime check - would be nice with static check)
  • Does one use Sage.both to open several files? - how does that code look?
  • And why use object types - do you use the row polymorphism? EDIT: Ah - instead of two type parameters for respectively read and write, you have two fields in the object type, rd and wr