Lwt not sequencing correctly?

Hi guys,

I am facing a strange issue when using Lwt and Lwt_stream. Here is the code in question, lwt1.ml.

If I do dune exec ./lwt1.exe then it gives me Fatal error: exception Lwt_stream.Closed

However, if I comment line 26 and uncomment line 29 it works.

I have defined *> to be let ( *> ) a b = a >>= fun _ -> b

My question: isn’t *> the same as line 29? What’s the difference? Why does line 29 work but not the version using *>, since they are - AFAIU - semantically the same?

It seems the a in ( *> ) a b doesn’t get called at all or something?

Operator *>'s arguments will be evaluated eagerly. You could try a ppx, I guess.

Isn’t that a simple operator precedence issue? Have you tried adding more parentheses?
I think e1 *> e2 >>= fun () -> e3 is parsed as (e1 *> e2) >>= fun () -> e3, while e1 >>= fun _ -> e2 >>= fun () -> e3 is parsed as e1 >>= (fun _ -> (e2 >>= fun () -> e3)).

That’s as @cvine said: arguments in function calls are evaluated eagerly. And in practice right-to-left (the order is officially unspecified), so it’s even more difficult to wrap your head around ( *> ). In fact a () *> b () is equivalent to let x = b () in let y = a () in y *> x, which is not what you meant.

I believe that’s a step everyone takes when becoming familiar with Lwt. I have certainly done it. Twice in fact.

You can use a ppx (as suggested) or you can use the form let* () = a in b. I think there were also plans to add ;* to go along with let*. Can anyone confirm this? I might have misremembered.
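
To make the difference concrete, here is a minimal sketch. It assumes Result as a stand-in for Lwt.t so that it runs with the standard library alone; step and log are hypothetical helpers invented for the illustration. It shows that with the eager ( *> ) the second step runs even though the first one fails, while the let* form only runs it after the first has succeeded.

```ocaml
(* Assumption: Result stands in for Lwt.t, so no Lwt install is needed. *)
let ( >>= ) = Result.bind
let ( *> ) a b = a >>= fun _ -> b
let ( let* ) = Result.bind

let log : string list ref = ref []

(* Hypothetical step: records that it ran, then succeeds or fails. *)
let step ?(ok = true) name : (unit, string) result =
  log := name :: !log;
  if ok then Ok () else Error (name ^ " failed")

(* Both arguments are evaluated before the operator body is entered,
   so "second" runs even though "first" fails. *)
let eager = step ~ok:false "first" *> step "second"
let eager_ran_second = List.mem "second" !log

let () = log := []

(* With let*, the continuation is a function that is only called
   once "first" has succeeded, so "second" never runs here. *)
let sequenced =
  let* () = step ~ok:false "first" in
  step "second"
let sequenced_ran_second = List.mem "second" !log
```

With real Lwt promises the same shape applies: by the time ( *> ) is entered, both promises have already been created and started.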

IIRC, the camlp4 extension used to expand e1 >> e2 like e1 >>= fun () -> e2 but that was a bit confusing since >> was not an actual operator. Then ;%ext got added to the language so that the above could be written as e1 ;%lwt e2.

Is this in the lwt_ppx package?

Edit:
I redefined *> as let ( *> ) a b = a;%lwt b
but it seems to exhibit the same behaviour as the previous code. I looked at the generated code and I believe it is the same as the earlier code, i.e. a >>= fun _ -> b. Here is the ppx-generated code for reference:

let ( *> ) a b =
  let module Reraise = struct external reraise : exn -> 'a = "%reraise" end in
    Lwt.backtrace_bind (fun exn -> try Reraise.reraise exn with | exn -> exn) a
      (fun () -> b);;
val ( *> ) : unit Lwt.t -> 'a Lwt.t -> 'a Lwt.t = <fun>

Indeed, this was the issue. I changed the *> to let ( *> ) a b = a >>= fun _ -> b () and line 30 to fun () -> and that seems to do the trick.
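
For reference, a minimal sketch of that thunked operator. It again assumes Result as a stand-in for Lwt.t (with Lwt, >>= would be Lwt.bind), and step and log are hypothetical helpers. The right-hand side is only forced once the left-hand side has succeeded.

```ocaml
(* Assumption: Result stands in for Lwt.t. *)
let ( >>= ) = Result.bind

(* The right-hand side is now a thunk, forced only after [a] succeeded. *)
let ( *> ) a b = a >>= fun _ -> b ()

let log : string list ref = ref []

(* Hypothetical step: records that it ran, then succeeds or fails. *)
let step ?(ok = true) name : (unit, string) result =
  log := name :: !log;
  if ok then Ok () else Error (name ^ " failed")

(* Success: "first" is logged before the thunk is forced. *)
let ok_run = step "first" *> (fun () -> step "second")
let ok_order = List.rev !log

let () = log := []

(* Failure: the thunk is never forced, so "second" never runs. *)
let failed_run = step ~ok:false "first" *> (fun () -> step "second")
let failed_order = List.rev !log
```

The cost is that every call site has to wrap its right-hand side in fun () -> ..., which is essentially what >>= already asks for.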

Yes, this case is handled here: lwt/ppx_lwt.ml at 650d64efe2bc96f0877b954df432bfee97295986 · ocsigen/lwt · GitHub

It is somewhat unintuitive, but precedence and associativity do not affect the evaluation order of arguments, which is unspecified in OCaml but is in practice right-to-left. So this calculation of (2 * 3) + 4 will evaluate the 4 first, then the 3 and then the 2, before carrying out the arithmetical calculation in accordance with the required precedences:

let () =
  let v = ((print_endline "evaluating 2" ; 2)
           * (print_endline "evaluating 3" ; 3))
          + (print_endline "evaluating 4" ; 4) in
  Printf.printf "%d\n" v

And this calculation of 2 * (3 + 4) will do the same:

let () =
  let v = (print_endline "evaluating 2" ; 2)
          * ((print_endline "evaluating 3" ; 3)
             + (print_endline "evaluating 4" ; 4)) in
  Printf.printf "%d\n" v

Somewhat off topic but I’ve seen people bitten by eager evaluation for Lwt many times (myself included). Wrote some notes about it here. Also curious to hear good counter arguments for why eager semantics is preferable. I’m not suggesting changing Lwt but potentially providing an abstraction on top of it.

That’s an interesting post but I cannot say I have found eager evaluation as such catching me by surprise, since most common languages that beginners use also feature eager evaluation. Casting my mind back to first using Lwt I should have thought the issue was rather more a matter of getting the hang of monadic composition of promises. Get it wrong and you can end up with concurrent operations when you intend sequential ones, complete with perplexing errors (and lazy evaluation is likely to give even more perplexing errors than eager evaluation). Get it right and the bind operators (whether let*, let%lwt or operator >>=) will do it all correctly for you.

What eager evaluation does perhaps bring with it is the need for a macro system to enable some classes of user-defined operator to work effectively. I know Lisp and Scheme macros well, and at a pinch in C I have resorted to the C pre-processor. But I have tended to keep clear of OCaml ppxes because they seem to require you to understand a considerable amount about the language’s AST, which can change.

Thankfully the let* and let%lwt macros already provide convenient monadic composition for us. The >>= operator is also pretty usable even without macros.

I’m not an Lwt maintainer, but I think a key advantage of the eagerly-evaluated semantics is performance: a lazily-evaluated “action” in OCaml must first be allocated, so that it can later be traversed by the scheduler that’s driving the computation. I’d expect a program written in terms of unit -> 'a Lwt.t to spend noticeably more time doing minor GCs. (Rust sidesteps this by using mutable poll-based futures rather than callbacks, but these obviously aren’t referentially transparent.)

On the usability side, I agree that the eager semantics of Lwt.t values are surprising, but think this is at least partly down to branding: a name like Promise.t or Future.t (or even Deferred.t …) seems more natural. IMO, recent versions of the Lwt manual do a good job of damage control for this misunderstanding (but for many the instinctive expectation is still the wrong one). That said, languages that use those names also have issues with users misunderstanding their semantics – and Scala in particular has quite a few external libraries providing “tasks” or “actions” instead.

Finally, I think a “task” model might not be a strict improvement w.r.t. understandability. An advantage of the eager semantics is that one can interleave regular direct-style effects with the Lwt ones and get sensible behaviour:

let do_the_thing () =
  Fmt.pr "Initialising the reactor core:@.";
  Reactor.start () >>= function
  | Error _ -> raise Keep_calm_and_carry_on
  | Ok monitor -> Lwt.return monitor

(e.g. I’m guaranteed that after printing the “Initialising” message I do actually make some progress towards starting the reactor until that process chooses to cede control.)

Another advantage of promises vs. tasks is in modelling resource ownership: with Lwt, I can delegate responsibility for cleaning up an object to a chain of promises (e.g. val read_lines : file_descriptor -> string Lwt_seq.t), and the scheduler can take care of closing this resource for me. If I’m trying to build referentially-transparent actions instead, I can’t delegate ownership to them without risking double-free bugs.

Thanks, all good points! A few comments/questions.

Since the point of Lwt is IO I would imagine this to be a very minor performance hit but perhaps I’m missing something. With:

type 'a t = unit -> 'a Lwt.t

let ( >>= ) m f () = Lwt.bind (m ()) (fun x -> (f x) ())

A program, p >>= f >>= g, would evaluate to:

fun () -> Lwt.bind (Lwt.bind (p ()) (fun x -> (f x) ())) (fun x -> (g x) ())

Given that p, f x or g x does anything substantial, should I be worried about applying the extra closures and additional GC?
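
To see the laziness of this encoding in action, here is a minimal sketch. It assumes a trivial synchronous stand-in module P in place of Lwt so that it needs only the standard library; P, task, log, p, f and g are all hypothetical names for the illustration. Building the program runs nothing; effects only happen once the resulting task is applied to ().

```ocaml
(* Assumption: P is a trivial, already-resolved stand-in for Lwt. *)
module P = struct
  type 'a t = Done of 'a
  let return x = Done x
  let bind (Done x) f = f x
end

type 'a task = unit -> 'a P.t

(* Same shape as the definition above, with P.bind instead of Lwt.bind. *)
let ( >>= ) (m : 'a task) (f : 'a -> 'b task) : 'b task =
  fun () -> P.bind (m ()) (fun x -> f x ())

let log : string list ref = ref []

let p : int task = fun () -> log := "p" :: !log; P.return 1
let f x : int task = fun () -> log := "f" :: !log; P.return (x + 1)
let g x : int task = fun () -> log := "g" :: !log; P.return (x * 10)

(* Composing only allocates closures; no step has run yet. *)
let program = p >>= f >>= g
let before_run = !log

(* Running the task executes the steps in order. *)
let P.Done result = program ()
let after_run = List.rev !log
```

Each use of >>= here allocates one closure for the composed task, and each run allocates the continuation passed to P.bind, which is the overhead being asked about.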

Fair. I guess one would rather use the monadic versions for all side-effects, i.e:

let do_the_thing =
  let* () = print "Initialising the reactor core:@." in
  let* res = Reactor.start () in
  ...

You mean specifically for something returning a sequence like structure? Where does a bracket combinator like the following go wrong?

(* val bracket : 'a Task.t -> finalize:('a -> unit Task.t) -> 'a Task.t *)
let bracket p ~finalize = fun () ->
  Lwt.bind (p ()) (fun x -> Lwt.map (fun _ -> x) @@ finalize x ())

My (anecdotal) experience with using Lwt in IO-heavy applications is that it already has a very non-negligible allocation cost (i.e. even after performance tuning, it’s still a large factor in perf / memtrace profiles), but this will obviously depend on how “substantial” your actions are.

A big factor is obviously that Lwt-ified functions bleed throughout the stack, so using Lwt for IO incurs a performance penalty proportional to the depth of your stack – every time one factors out a function on the route from the top-level API to the Lwt-ified IO layer, there’s a good chance of introducing another allocation per IO operation. (The symptom is that my Lwt programs have a lot of Lwt.maps and Lwt.returns necessary to appease the type system.) YMMV.

The () wrappers have a similar characteristic, which I think is made a bit clearer by expanding f and g, assuming they both do something useful and then massage the result in some way:

   p >>= (fun f_x -> ... return f_y) >>= (fun g_x -> ... return g_y)

  (* becomes *)
  
   p >>= (fun f_x -> ...<some-async-action>... fun () -> Lwt.return f_y)
     >>= (fun g_x -> ...<some-async-action>... fun () -> Lwt.return g_y)

  (* becomes *)

  fun () ->
  Lwt.bind 
    (Lwt.bind
      (p ())
      (fun x -> (fun f_x -> ... fun () -> Lwt.return f_y) x ()))
    (fun x -> (fun g_x -> ... fun () -> Lwt.return g_y) x ())

I think the problem here is that the x and () arguments are never passed together: you first pass x, which does some computation and allocates a closure representing the next task, and then pass () to execute that task. Again, for some definition of a “substantial” async task this won’t matter, but in other cases this will be a non-trivial perf loss.

This is fair enough, although now one would seem to also want a monadic encoding of mutations / exceptions, and sooner or later we may as well be writing Haskell 🙂 (Kidding, of course.)

I was thinking more generally of cases where I want to provide resources to an IO-using operation (e.g. a task equivalent of Lwt_unix.read : file_descr -> bytes -> off:int -> len:int -> int Lwt.t), i.e. when I want to either temporarily or permanently delegate control of a resource. Using bracketing works, but I think your bracketing combinator is missing an ~acquire phase (a left bracket 🙂):

val bracket : ('a -> 'c Task.t) 
           -> acquire:'a Task.t
           -> finalize:('a -> unit Task.t)
           -> 'c Task.t

Otherwise, I think the implication is that the inner task being passed to bracket already has ownership of some value – so it’s not referentially transparent. c.f. Haskell’s bracket and Cats’ bracket.
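
A minimal sketch of a combinator with that signature, under loud assumptions: Task.t is modelled here as a plain synchronous thunk (unit -> 'a) rather than unit -> 'a Lwt.t, and Fun.protect plays the role that Lwt.finalize would presumably play in a real Lwt version. The helpers log, note and with_handle are hypothetical.

```ocaml
(* Assumption: a synchronous stand-in; a real Task.t would wrap Lwt.t. *)
module Task = struct
  type 'a t = unit -> 'a
end

(* Acquire inside the task, use the resource, and always finalize it. *)
let bracket (use : 'a -> 'c Task.t) ~(acquire : 'a Task.t)
    ~(finalize : 'a -> unit Task.t) : 'c Task.t =
 fun () ->
  let resource = acquire () in
  Fun.protect
    ~finally:(fun () -> finalize resource ())
    (fun () -> use resource ())

let log : string list ref = ref []
let note s = log := s :: !log

(* Hypothetical resource: an int "handle" acquired afresh on each run. *)
let with_handle : int Task.t =
  bracket
    (fun h () -> note "use"; h + 1)
    ~acquire:(fun () -> note "acquire"; 41)
    ~finalize:(fun _ () -> note "release")

let before_run = !log
let first = with_handle ()
let second = with_handle ()
let events = List.rev !log
```

Because acquisition happens inside the task, running with_handle twice acquires and releases twice, which is what keeps the task itself free of any captured resource.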

This model works, but it requires my resource usage to actually be well-bracketed, and in my real-world code this is often not the case: e.g. I have a “database handle” or some such that lives throughout my program lifetime and needs to keep a single FD open in order to write and read from many times, so the write and read functions can’t be well-bracketed tasks (it’s not acceptable to open and close my FD each time I want to interact with it). To work around this, I’d have to have my database handle be well-bracketed too, and so this obligation works its way up the stack. (In Tezos, these FDs are three or four libraries deep.)

Interestingly, the Cats library has a solution for this boilerplate propagation, which is to add another monad layer for each resource 😁

To be clear, I wasn’t trying to dismiss the idea of referentially-transparent tasks a priori: I think they’re a nice abstraction and would fit some use-cases really nicely. I was just thinking about the implications for my own usage of Lwt.

Right, your combinator makes more sense.

Hmm, I struggle to see the difference with a promise-based solution. You acquire your resource, pass it around, do some stuff, and close it as a last step. How do you address that?

Haha, sounds scary. Speaking of Scala, Zio looks like a very pragmatic alternative. I will take a look and see what they have to offer on that front.

Sure, really appreciate your points. As always it comes down to trade-offs.

That is a most interesting exchange. Moving from actions (and promises running eagerly) back to the mundane, namely the eager evaluation of arguments, the only time I have nearly been caught out on that with promises is when the function to be bound is derived from a partial application, as the partial application occurs outside the scope of the bind. Using Result.t types instead of Lwt.t types, what this snippet of code prints may surprise some people.

let () =
  let (>>=) = Result.bind in
  let add2 a b = Ok (a + b) in
  let res = Ok (print_endline "evaluating 2" ; 2)
            >>= add2 (print_endline "evaluating 3" ; 3)
            >>= add2 (print_endline "evaluating 4" ; 4) in
  match res with | Ok i -> Printf.printf "%d\n" i
                 | _ -> assert false

Edit: Of course, using let* or let%lwt instead of operator >>= eliminates this kind of problem.
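
A minimal sketch of that let* rewrite, still with Result.t standing in for Lwt.t. The log and note helpers are hypothetical and replace print_endline so that the order can be inspected. Each partial application now sits inside the continuation of the previous bind, so the steps are evaluated in source order.

```ocaml
let ( let* ) = Result.bind

let log : string list ref = ref []
let note s = log := s :: !log

let add2 a b : (int, string) result = Ok (a + b)

(* Each right-hand side is evaluated only when the previous bind
   calls its continuation, so the order is 2, then 3, then 4. *)
let res =
  let* a = Ok (note "evaluating 2"; 2) in
  let* b = add2 (note "evaluating 3"; 3) a in
  add2 (note "evaluating 4"; 4) b

let order = List.rev !log
```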
