How to combine 3 monads: Async/Lwt, Error and State?

Hi all, I need some help with the problem in subject. A bit of context: I’m writing a program to import data from a legacy database into a new one for a new website, written in Django. The source of data are two: the database itself or CSV files (yes, still in 2021…) but in any case I read a record at a time using an Lwt_stream, something like this:

let%lwt final_state =
  Lwt_stream.fold_s (fun record state
    (* import procedures here *)
    (* … … … *)
    (* write the new record into the new DB *)
    Lwt.return state
  ) record_stream initial_state in

The various import steps can “hard fail” (returning Lwt_result.Error or something like that) or “soft fail”, for example I decide to manipulate the old record to keep it. In this last case the procedure will return an Ok something. The error monad here is used to skip all the steps after an hard fail, as usual.

In this scenario the module Lwt_result is perfectly ok but I need more. The importing functions need to carry a state containing data related to the import process: how many records did I import? How many fatal or soft errors occurred and why? From this state I’m requested to write down a detailed report to inform customers with particularly bad data.

The state monad seems to be perfect but I can’t… write it.

So the question is: how such a monad should look like? How to write bind, return (and friends) and how to use let* to fit the 3 monads together so that the resulting monad (async error state?) play nicely with the rest of Lwt?

In this moment I’m using Lwt_result and for the state… well… a global mutable :sweat_smile: which is extremely ugly.

1 Like

How about simplifying the problem and keeping the state in, say, a SQLite database? Could be in-memory or on-disk.

EDIT: the import statistics could even be kept in the target (destination) database. That would give you atomicity when updating them.

1 Like

This is an option of course and after all it’s what I’m doing with a mutable state.

I’d like to learn something new and I think this could be an interest topic for many ocamlers.

Gotcha. I believe this will work:

module Aes : sig
  type (+'a, +'e, 's) t
  (** ['a] = return
      ['e] = error
      ['s] = state *)

  val return : 'a -> ('a, 'e, 's) t
  val fail : 'e -> ('a, 'e, 's) t
  val lift : ('a, 'e) Lwt_result.t -> ('a, 'e, 's) t
  val eval : 's -> ('a, 'e, 's) t -> ('a, 'e) Lwt_result.t
  val exec : 's -> ('a, 'e, 's) t -> ('s, 'e) Lwt_result.t
  val run : 's -> ('a, 'e, 's) t -> ('a * 's, 'e) Lwt_result.t

  val ( let* ) : ('a, 'e, 's) t -> ('a -> ('b, 'e, 's) t) -> ('b, 'e, 's) t
  (** Monadic bind *)

  val ( let+ ) : ('a, 'e, 's) t -> ('a -> 'b) -> ('b, 'e, 's) t
end = struct
  type ('a, 'e, 's) t = 's -> ('a * 's, 'e) Lwt_result.t

  let return a s = Lwt_result.return (a, s)
  let fail e s = Lwt_result.fail e

  open Lwt_result.Syntax

  let lift lwt_result s =
    let+ a = lwt_result in
    a, s

  let eval s t =
    let+ a, _ = t s in
    a

  let exec s t =
    let+ _, s = t s in
    s

  let run s t = t s

  let ( let* ) t f s1 =
    let* a, s2 = t s1 in
    f a s2

  let ( let+ ) t f =
    let* a = t in
    return (f a)
end

The key is to model the stacked monad type. Since the state monad is basically a function, an async state monad can just an async function. And an async state monad with an error channel can just use the Lwt_result module.

EDIT: expanded the module and made the appropriate type params covariant.

1 Like

I would strongly suggest looking at the monads library: Std (monads.Monads.Std). It’s entire raison d’être is to support composing the semantics of multiple monads in a principled way, and it provides the bits needed to make common combinations (e.g. Error + State) quite straightforward.

I seem to remember someone had posted an lwt + state implementation using it, but now I can’t find it. Perhaps @ivg would remember…

3 Likes

Would I be correct in assuming that you want to use monads? B/c unless your database access is via some sort of monadic library, you could just write in direct style and it’d all be straightforward, no?

I’ve avoided using Std just because it feels heavy and I’m concerned about losing performance. Do you know if this concern is just entirely unfounded (I haven’t had time to verify it)?

I had similar concerns prior to adopting it. I’ve since used it in some other minor spots, but my primary usage was in eliminating a global bag of state used as a dumping ground for planning-time warnings and then runtime instrumentation of a query language implementation. That was replaced with a state+result monad that made using the library much simpler, and the functional test suite actually ran faster afterwards by some ~negligible factor.

In terms of fair critique, I think monads.Std's API is sort of overwrought (in a Base/Core sort of way), which put me off for a bit, but the benefits are totally worth it IMO.

1 Like

The monads library provides the transformers for some well-known monads. All these monads have a more or less standard implementation, offering the same performance as any other monadic library can offer. Like there is no better way of implementing the state monad other than a function. We have experimented a lot with different performance optimizations, such as boxing and unboxing it and inlining various operators, and keep experimenting to get the maximum from the current compiler. In BAP, we heavily use the monads library, first of all for our knowledge representation and reasoning engine, which is the foundation for all BAP analyses. We also use it for emulating binary programs. The rich interface is here to make our life easier and more comfortable when we use monads. It definitely comes for free1 as the number of functions doesn’t affect the performance of the underlying monad.

But… there is always a but :slight_smile: Stacking monads using a transformer does have a price. Even with the flambda compiler. The latter is doing an excellent job of unstacking them and eliminating the overhead of having a chain of monads. But our latest experiments show that a custom-made monad (still with the monads library) performs better under either branch of the compiler. We have rewritten our main monads that were relying on transformers and got from 20% to 50% performance improvement. But that is not to say that the monads library itself is slow or that we’re not using it, it is to say that there are other options to transformers that might work in some cases. See the linked PR if you want to learn the trick.


1) Provided that we ignore the size of the executable, e.g., linking the core_kernel library results in a quite large binary, which may increase the startup time. Insignificantly, but in some use cases, it might be imortrant.

4 Likes

As it was already suggested, you can use monad transformers, to compose several monads into a single monad. As a show-case, we will use the monads library (disclaimer, I am an author of this library), which you can install with

opam install monads

It offers most of the well-known monads in a form of a monad transformer, which in terms of OCaml, is a functor that takes a monad and returns a new monad that enriches it with some new behavior. For example, to make a non-deterministic error monad, we can do Monad.List.Make(Monad.Result.Error) and get a monadic structure (i.e., a module that implements the Monad.S interface) that is both a list monad and an error monad. The small caveat is that the operations of the wrapped monad, the error monad in our case, are not available directly, so we have to lift them, e.g.,

let fail p = lift @@ Monad.Result.Error.fail p

So that in the end, the full implementation of the transformed monad still requires some boilerplate code,

module ListE = struct
  type 'a t = 'a list Monad.Result.Error.t
  include Monad.List.Make(Monad.Result.Error)
  let fail p = lift@@Monad.Result.Error.fail p
  (* and so on for each operation that is specific to the wrapped monad *)
end

Now, let’s try wrapping the Lwt monad into the state. We don’t want to add the Error monad because Lwt is already the error monad and adding an extra layer of errors monad is not what we want. First of all, we need to adapt the Lwt monad to the Monad.S interface, e.g.,

module LwtM = struct
  type 'a t = 'a Lwt.t
  include Monad.Make(struct
      type 'a t = 'a Lwt.t
      let return = Lwt.return
      let bind = Lwt.bind
      let map x ~f = Lwt.map f x
      let map = `Custom map
    end)
end

If we want to keep the state type monomorphic, then we will need a module for it. Suppose your state is represented as,

module State = struct
  type t = string Map.M(String).t
end

Now, we can use it to build our State(Lwt) Russian doll,

module IO = struct
  include Monad.State.T1(State)(LwtM)
  include Monad.State.Make(State)(LwtM)

  (* let's lift [read] as an example *)
  let read fd buf ofs len =
    lift (Lwt_unix.read fd buf ofs len)
end

The Monad.State.T1 functor is used to create the types for the generated monad. You can write them manually, of course, like as we did in the List(Error) example, but the type generating modules are here for the convenience1

Now, let’s get back to the problem of the lifting. It looks tedious to impossible to lift every operation from Lwt. Commonly, we try to put the smaller monad inside, to minimize the work, but it doesn’t work with Lwt as the latter is not a transformer. So what is the solution? For me, the solution is to not lift the operations at all, but instead, define your IO abstraction and hide that it is using Lwt underneath the hood. This will make the code that uses this new abstraction more generic and less error-prone so that it can focus on the business logic and the implementation details could be hidden inside the monad implementation. This is what the monads are for, anyway.


1) We omit the types from the output of the Make functor since for a long time OCaml didn’t allow the repetition of types in a structure so having the types in it will prevent us from composing various flavors of monads using include. It is also a long-time convention widely used in many OCaml libraries, including Core and Async. A convention that we probably don’t need anymore.

6 Likes

Thank you all for the replies, especially @ivg for the detailed explanation of the monads library.

For @Chet_Murthy: yes, I want to use monads for the simple reason that I feel very comfortable with this “design pattern” which I consider very neat and clean.

1 Like

Thank you for the response @ivg and @cemerick . Looking at monads I would like to use it but currently what’s keeping me away is the core_kernel dependency. It’s a bit more than I’m willing to pull in even if it’s beneficial. Do either of you know if there are plans to make it more standalone?

The usecase I think I would benefit from is I have my own concurrency monad and I cam currently implementing concurrency + option and concurrency + result monads myself which has its share of issues. The monads library would also get me the let* and and* for free I think (free applicatives?)

Thank you for getting me to dig into the monads library again.

1 Like

I doubt it would be possible, as it will break the interface. The most breaking change is the dependency on Core’s Sequence.t in the interface of each monad. Also the multistate monad has a lot of dependencies on Core.

Well, this is not that big deal :slight_smile: Only and* is non-trivial, but still very simple,

      let (let*) = (>>=)
      let (let+) = (>>|)
      let (and+) x y =
        x >>= fun x ->
        y >>| fun y ->
        (x,y)
      let (and*) = (and+)

My suggestion is to implement them using the same representation as we did in the Primus monad, e.g.,

type state (* if you need one *)
type error (* your error type *)
type 'a thread (* your concurrency monad *)
type r = error result thread
type 'a t = {
  run : 
      reject:(error -> state -> r) ->
      accept:('a -> state -> r) ->
      state -> r
}

This monad gives you State, Error, and Cont all for the same cost as any of these monads alone. You can even add [@@unboxed] to turn this monad into a function, but my experiments and angstrom (they use a similar monad) show that it will have a negative impact on performance, but YMMV so try it.

2 Likes