Lwt 3.2.0 released – we plan to change Lwt.bind

Lwt, the old concurrent programming workhorse, OCaml’s trusty promise library, has released version 3.2.0! It is now installable from opam.

opam update
opam upgrade lwt    # or opam install lwt


The highlights of this release are…

  • Completely new, and very thorough, documentation for the main module, Lwt. Unfortunately, since the Ocsigen infrastructure appears to be down, the new documentation can only be found at the link in the previous sentence, or by opening lwt.mli.
  • A planned change to the semantics of Lwt.bind – see below.
  • Ongoing Lwt repository and codebase cleanup. Part of that is announcing some breaking changes for Lwt 4.0.0 – also see below.
  • Lots of bugfixes, improvements, and cleanups by many contributors. See the full changelog. The easy issues list has been considerably depleted :slight_smile:

To break in 4.0.0

It is a rule of the Lwt project, that we announce all breaking changes ahead of time, give gentle transition paths, and at least three months for users to adapt.

In that fashion, Lwt 3.2.0 announces and prepares some breaking changes, which will be committed in March 2018, with the release of Lwt 4.0.0. Full details are in a notice in an Lwt repo, but a summary is included here.

  • We discovered that the semantics of Lwt.bind, and most other basic Lwt functions, are somewhat broken. So, we will adjust them in 4.0.0. The notice gives a branch you can pin, to run your code now, before 4.0.0, against the future 4.0.0 semantics.
  • The PPX, Camlp4 extension, and logging library are being factored out into new opam packages lwt_ppx, lwt_camlp4, and lwt_log. The logging library is also being deprecated, while the Camlp4 extension was already deprecated.
  • We will remove the >> construct from the PPX.
  • lwt.preemptive has been merged into lwt.unix, and the package name lwt.preemptive will disappear.

If you are the maintainer of an affected package in opam, we have already found you and will cc you in the notice shortly! If you maintain a private codebase, we encourage you to monitor the Lwt announcements issue, which gets one comment for each Lwt release, linking to any planned breaking changes.


Happy concurrent programming!

15 Likes

I am pleased to see that this is topic 1337.

16 Likes

Most of these changes are good, but please reconsider changing the behaviour of bind. We have hundreds of thousands of lines of code using Lwt, and there is no way we can afford to go through all of them checking whether they are still correct with the new semantics.

And this change won’t give an error at compile time, nor (probably) during testing, since the behaviour changes depending on the stack depth when your function is called. Race conditions that only occur at particular stack depths are horrible :frowning:

I suggest making a new Lwt.Infix2 module with the new behaviour. Though I’d prefer that bind either always defers or never does. I’d rather have a crash or stack overflow than a race.

3 Likes

Yes, to everyone else: please give your opinion. We have three months to change our minds.

We might just give up on changing the existing bind, and create a new API with bind and several other problems fixed. That would probably be after 4.0.0, though.

2 Likes

You mention that book-keeping has to be done after the callbacks return, which means you have a deep call-stack.
Is sequence associated storage what you are refering to here, or is there something else?
No doubt sequence associated storage is useful in some situations (e.g. debugging/logging), but if it is the only thing causing troubles for Lwt.bind would changing the semantics or implementation of that be less invasive than changing the semantics of bind? (e.g. instead of a global try to thread-through something in the 'a t itself).

Though I’d prefer that bind either always defers

This would make Lwt’s bind work the same as Async’s, right? I guess at this point this is just a dream, but it could be a starting point in unifying both libraries’ cores… More realistically, it would make it simpler for libraries that try to support both via functors.

2 Likes

This refers to exception handling actually. “Book-keeping” is a pretty imprecise and useless term here, so I tried to clear up that part of the docs: https://github.com/ocsigen/lwt/commit/e8b74b0fddb2fca3fa15e28d621e06c0f5c02767

As for sequence-associated storage, it’s a pain and I’d love to remove it, but it’s not the problem here.

It would bring it much closer. I don’t have enough experience with Async to comment here.

It’s worthwhile looking into their implementation, because presumably they had to work around the same inefficiency you’re trying to amortize. Perhaps they have a solution that’s more consistent?

Always deferring seems a more uniform style that avoids the stack overflow issue. What are the potential drawbacks of switching to such a style?

I think the only issues are performance and breakage, but we are talking about breaking things either way.

We may have to defer this change in some way past 4.0.0, though.

Has any benchmark been done of the different models? Do you know of a real-life example that may break by switching to the “always defer” model?

I’m not aware of any realistic benchmark, just anecdotes.

I did do a measurement in the PR that initiated the process of changing the semantics. However, it is comparing Lwt’s current “time-tested” implementation of its current semantics, against an initial, extremely naive, and generalized implementation of deferral, which definitely allocates too many closures, etc. You can find it under “Effect on performance” at the link. I got 8ns per bind with Lwt’s current code, and 144ns per bind when always deferring. I’d be happy to see a more useful comparison, and maybe I will do one later.

I’m not personally aware of one, as basically all Lwt code I’ve written doesn’t make an assumption about when callbacks run, and also all Lwt code I recall reading didn’t make such assumptions either.

@talex5 has mentioned that this could break mirage-firewall in slightly.

Having had time to discuss and think about the Lwt.bind change, I think it’s indeed better to postpone or completely abandon it. The reasoning is:

  • A lot of code would potentially have to be reviewed, and it’s difficult to do automatically.
  • The semantic problem, I think, is quite minor, when taken in context. The details are immediately below.

So, it doesn’t seem like the tradeoff is worth it. We may have a fresh “front-facing” Lwt API at some point, and then we will have the opportunity to get the semantics fully right without risking any big breakage.


The original issue is: Lwt.bind p f raises an exception if p is already resolved and f raises. This is inconsistent with the rest of Lwt, including Lwt.bind when p is not resolved: in all other cases, if f raises, the Lwt API puts the exception into rejecting a promise rather than raising it further up the stack.

Why this is probably not important: in an Lwt program, the exception leaked by Lwt.bind p f is ultimately going to almost always have the same “fate” as an exception used to reject Lwt.bind p f. Examples:

  • If the Lwt.bind p f is called within any other Lwt callback, which is very common when chaining Lwt computations, the Lwt API that is calling the callback will capture the exception and reject ultimately the same promise as Lwt.bind p f “should” reject.
  • If Lwt.bind p f is being called as part of resolving a promise that Lwt_main.run depends on, converting Lwt.bind to reject instead of raising will just cause Lwt_main.run to raise the exception from the rejeted promise right afterwards.
  • Leaking an exception to Lwt.async has the same effect as rejecting the promise returned to it, which is a call to !Lwt.async_exception_hook.

…and so on.

The above is not a proof. There are probably corner cases where the behavior is actually different in some material fashion. So, these semantics are still bad. However, the corner cases seem to be rare (nobody has ever reported an instance, we found the issue by reading the Lwt code), and so fixing this is not worth making people lose trust in already-written code.


Thanks to @talex5 and others for resisting this change, and to everyone else as well, for all feedback and testing :slight_smile:

11 Likes