Type Trick: Polymorphic variants for Extensible Modular Messaging

One place I’ve often found myself trying to use Polymorphic variants in has been in implementing modular messages.

The general idea is that the components of your system operate in terms of messages, and update their state upon receiving specific messages.

module Volume = struct 
      (* state of this subsystem *)
      type state = int

      let update msg self = match msg 
          | `Incr -> 
           (* received an incr message from somewhere, don't care where *)
           (* update self *) 
           self + 1 
          | _ -> self (* ignore other messages *)
end

Once we have a collection of modules defined in this way, the goal would be to hook them up all together with an event bus that propagates messages between the systems:

let deliver msg systems =
    List.map (fun (state, system) -> 
             (* for each system *)
             let module M = (val system : SYSTEM) in
             (* update the state according to the message *)
             let state = M.update msg state in
             (state, system)) systems

In this way, the individual systems can be modular, loosely coupled to each other and easy to understand/test (the behaviour of a system depends only on its state and the message it receives).

By encoding the messages as a polymorphic variant, it means that we don’t need to explicitly create a monolithic message type (It also gets around some cyclic dependency problems that you might run into if certain systems depend on the same message (and this can’t be solved by extensible types either)).

The problem that I’ve kept on running into with this is that functors can’t capture this behaviour:

module type SYSTEM = sig 
   type state (* okay, no problems *)
   val update : [> ] -> state -> state (*  doesn't work - or not in the way one intends it to *)
end

Judging from previous questions that have been asked on this forum, I believe I’m not the only one who has tried and run into this wall in the past.

Anyway, I recently realised the solution to my problem - I was trying to use the module system to represent something I should have done in terms of OCaml values instead.

We start with a GADT to represent a system:

type 'b system =
    Mod : ('a * (([> ] as 'b) -> 'a -> 'a * 'b list)) -> 'b system

(The existential type 'a represents the state of the system, and 'b represents the final type of messages accepted by the system.

We can then define our systems by encoding them not as modules, but as instances of this GADT:

module Background: sig
  val system: [> `Incr | `LIncr ] system
end = struct
  let system =
    let init = 0 in
    let incr = function
      | `Incr -> fun n -> n + 1,[`LIncr]
      | _ -> fun n -> n,[] in
    Mod ((init : int),  (incr :> [> ] -> int -> int * [> ] list))
end

module Volume : sig
  val system: [> `FIncr | `Incr ] system
end = struct
  let system =
    let init = 0. in
    let incr = function
      | `Incr -> fun _ -> 0., []
      | `FIncr -> fun n -> n +. 1., []
      | _ -> fun n -> n, [] in
    Mod ((init : float),  (incr :> [> ] -> float -> float * [> ] list))
end

Finally, with this, we can write a message bus function to deliver messages:

let update msg system = match system with
  | Mod (vl, fn) ->
    let vl', msgs = fn msg vl in
    Mod (vl', fn), msgs

let rec update_all (acc, msgs) msg = function
  | [] -> List.rev acc, List.concat msgs
  | hd :: tl ->
    let hd, omsg = update msg hd in
    update_all (hd :: acc, omsg :: msgs) msg tl

let update_all = update_all ([],[]) 

The beauty of this is that the type system will track the shapes of polymorphic variants accepted by the systems exactly as we would want:

let example = update_all `Incr [Background.system;Volume.system]
5 Likes

What do you do about the open row variable - a weak global variable here that prevents
compilation?
Is it necessary to convey such values as state threaded through one’s program?
Is it foolish to want a way to signal the compiler to close the row variable (after all
constructors have been seen) so that storage at global level is possible?
This seems not unlike how OCaml doesn’t do finalsation of object types
but might benefit from explicit instruction to do so (ie. in course the of generating an executable)

Yeah, good point. I was mainly thinking of a design in which all components are encoded in terms of these modular systems, but when you start adding other global state into the mix, then explicit anotations will be needed I guess.

The threading is a little annoying - you can use a mutable type to mitigate some (but not the issue of being able to have global state) of the issues there - i.e the type of systems would be:

type 'b system =
    Mod : ('a * 'b Queue.t -> (([> ] as 'b) -> 'a -> 'a)) -> 'b system