Lwt core refactored and documented to be contributor-friendly

antron · July 2, 2017, 10:31pm

I’m not sure what format or how much information you would like (let me know), but here’s a summary. I’m simplifying, and omitting irrelevant details. If you want the “real deal” on the data types discussed below, see the actual lwt.ml Also, if you want a more general and simpler discussion of GADTs, let me (and the community) know. The discussion below is maybe too focused on the actual types in Lwt, which could be a source of noise.

Using regular variants

An Lwt promise is basically

type 'a promise =
  {mutable state : 'a promise_state}

and 'a promise_state =   (* a promise can... *)
  | Resolved of 'a       (* resolve with a value *)
  | Failed of exn        (* fail with an exception *)
  | Pending of ('a promise_state -> unit) list
                         (* still be pending, with a list of callbacks
                            to call when it completes *)

Those callbacks are used to implement things like bind, join, etc. They are always called exactly when a promise stops being Pending, i.e. when the state becomes Resolved or Failed. Therefore, those are the only two possible constructors that can be passed to the callbacks.

The problem

However, as you can see, the type promise_state has that third constructor Pending. So, lwt.ml was littered with code like this:

let bind p f =
  (* ... *)

  (* Ok, we figured out that `p` is pending, so let's add a callback to
     it, to do something when it's not pending anymore. *)

  let callback =
    fun p's_new_state ->
      match p's_new_state with
      | Resolved v -> (* do something with `f v` *)
      | Failed exn -> (* do something with `exn` *)
      | Pending _ -> assert false
        (* This case is impossible, but the type system doesn't know! *)
  in
  add_callback callback p;

  (* ... *)

This is a simple example, but the large number of undocumented assert false, especially in rather esoteric helpers, was a source of anxiety. For each one:

It’s clear that the case shouldn’t happen, but not why not.
The error is a runtime error, so one is never sure if some crazy scenario can trigger it, and that the tests are thorough enough.

Why not use `result`?

One way to resolve this is to pass ('a, exn) result to callbacks, instead of 'a promise_state. However, this has two drawbacks:

It requires allocating an ('a, exn) result every time a promise completes. We would like to reuse the information already carried in the promise_state. We could solve that by having a case
```
| Completed of ('a, exn) result
```
instead of separate Resolved and Failed, i.e. the Async way, but that’s still two allocations per promise, and it only seems plausible because I am simplifying – see next point…
This only solves the problem for callbacks. There are other places where we want the type system to know the promise is only Pending, or satisfies some other condition. Those also have assert false expressions.

The GADT way

We can instead define

type completed
type pending
(* There are no values of these types. Their only purpose is for the
   compiler to think that they are different types from each other. *)

type (_, _) promise_state =
  | Resolved : 'a  -> ('a, completed) promise_state
  | Failed   : exn -> ('a, completed) promise_state
  | Pending  :
    (('a, completed) promise_state -> unit) list
                   -> ('a, pending)   promise_state

Ok, this is much more verbose, but the only material differences here are:

We tell the compiler what kind of promise_state each constructor produces. Instead of everything being just one kind of 'a promise_state, there is now ('a, completed) promise_state and ('a, pending) promise_state.
We tell the compiler that callbacks take only the completed variety of promise state.

Now, because the compiler knows that callbacks don’t take a pending state (that’s why we need that type pending to be different from completed), we can do this:

let callback =
  fun p's_new_state ->
    match p's_new_state with
    | Resolved v -> (* ... `f v` ... *)
    | Failed exn -> (* ... something with exn ... *)
in

and simply drop the Pending case, because, again, the argument p's_new_state has type ('a, completed) promise_state in callbacks. And, in the place where we call these callbacks in the implementation, the compiler does not allow us to pass a promise_state that even might be pending!

The representation of this GADT in memory, at runtime, is the same as of the original variant shown above. But through the type system, we know that at this point only the first two cases are possible.

In fact, not only can we write the callback that way, but we have to. The compiler will yell cryptic errors at us if we accidentally add a Pending case!

Of course, the actual lwt.ml has more complex scenarios than this. I probably wouldn’t have used a GADT if this was the only thing we wanted to push into the type system. It would be overkill. If you want to see what else the GADTs are used for, take a look in the file. Hopefully, it’s documented clearly enough; otherwise, questions are welcome As a note, the other major use of GADTs in the new lwt.ml is so-called existential types. That has little to do with eliminating assert false, but I can explain that separately, if desired.

Topic		Replies	Views
Lwt not sequencing correctly? Community lwt	15	932	June 23, 2021
Do I need polymorphic variants? Learning	18	5242	January 6, 2023
New Draft Tutorial on Polymorphic Variants Learning announce , ocamlorg , polymorphic-variants , learn-ocaml , tutorial	24	1656	November 30, 2023
[ANN] Lwt.6.0.0~alpha (direct-style) Ecosystem lwt , effects , monad	31	1523	September 19, 2025
Is it feasible to write parsers without using polymorphic variants for AST representation? Learning	16	2190	April 25, 2018

Lwt core refactored and documented to be contributor-friendly

Using regular variants

The problem

Why not use result?

The GADT way

Related topics

Why not use `result`?