Multi-shot continuations gone forever?

cyberpink · December 30, 2021, 9:44pm

What happens with multi-shot continuations now that Obj.clone_continuation was removed? (https://github.com/ocaml-multicore/ocaml-multicore/pull/651)

Anything that requires a “fork” operation, like say, a probabilistic programming EDSL, needs this. None of the old examples I’ve looked at like Delimcc on top of effects have been updated to use a new method, and I haven’t been able to find any hints of one.

Are multi-shot continuations just not possible now? Are there plans to add something equivalent back in later?

nojb · December 31, 2021, 1:07am

Yes, multi-shot continuations are gone and is unlikely that they will find their way back any time soon. One (good) reason is explained in https://dl.acm.org/doi/10.1145/3434314 :

and

Cheers,
Nicolas

gadmm · December 31, 2021, 3:54pm

I think the question still stands. You cut the sentence “Extending our system with multi-shot continuations is future work (§8)”. Also the paper is about a particular model based on separation logic rather than OCaml itself (for instance the authors also mention that their continuations are affine instead of linear unlike in OCaml multicore).

Nevertheless, the multicore designers were aware that duplicating continuations makes it complicated to reason about resources. The topic of mixing continuations and linearity has been better studied from the angle of algebraic models of effects and proof theory. Essentially, with an effect system you could ensure that certain kinds of effects do not happen in the delimited part of the program (including allocating a resource), which controls copiability of the stack from the point of view of reasoning about the program. This is inspired by some logics that mix classical and intuitionistic or linear logic. From this angle the ability to copy a continuation would be restricted to a sub-part of the language which is pure to some degree. This should also be a suitable starting point if one wanted to develop a program logic to formalise the reasoning about such programs.

However according to #651 there were more technical reasons to drop clone_continuation, such as breaking compiler optimizations. I am curious as well to know whether there are plans to reintroduce clone_continuation at some point, but obviously this would require some effort.

kayceesrk · January 1, 2022, 4:36am

@nojb and @gadmm have already answered why we’ve dropped support for clone_continuation now. We will need to track the copiability of the continuation in the continuation type and compiler optimisations also need to be made aware of the possibility of copying. Given the pervasive nature of its effects, there are no immediate plans to bring the feature back. We will have to come back to this after we have typed effects.

One can get pretty far with PPL with just one-shot continuations. My student and I did some experiments building a DSL for a PPL to learn about the space: GitHub - Arnhav-Datar/EffPPL: Initially an Algorithmic Diffrentiation Library. Now has been extended to a Probabilistic Programming Library. Having spoken to PPL experts there are indeed some usecases where multi-shot continuations are useful, but from what I understand, the one-shotness isn’t a blocker for PPL.

I would be interested in collecting usecases where multi-shot continuations are absolutely necessary.

gasche · January 3, 2022, 4:32pm

Interesting!

My (probably naive) mental model of HANSEI-style libraries, using multishot continuations, is that they are extensions/generalization of a non-probabilistic “logic/non-deterministic monad” that searches for the set of solutions to a problem. Multishot continuations are naturally used in non-deterministic computations at backtracking points, to explore different search directions and collect the result. It is possible to avoid multishot continuations by replaying the whole search from the start each time (reference: Capturing the future by replaying the past), but this involves duplicated computations so it is less efficient (reference: Asymptotic speedup with first-class control).

Can you give some intuition of how other approaches to probalistic inference work, that do not require multishot continuations? Are they also duplicating computations, or are they using a magic trick to avoid this issue with a different inference algorithm?

I tried to find an answer to this question by reading the internship report, but I couldn’t locate an answer. (The report mentions HANSEI in the related work, but it does not discuss this question.) The report explains that the inference algorithm, called HMC (Hamiltonian Monte Carlo), uses automatic differenciation; so it uses a sort of symbolic manipulation / analysis of the probabilistic program to sample. But does this avoid repeated computations? It may be the case instead that the differential is as large or larger than the program itself, and that the search algorithm using this differential in effect perform a program-sized computation at each search step, duplicating computations.

sadiq · January 3, 2022, 5:00pm

Not a PPL but I’ve been hacking on a little effects-based model checker for concurrent data structures that implements dynamic partial order reduction (GitHub - sadiqj/dscheck - a WIP!). Multi-shot continuations would have been very useful.

I ended up implementing something that involves maintaining a schedule and repeatedly replaying the computation. It looks very similar to what Capturing the future… proposes.

n4323 · January 3, 2022, 6:56pm

As far as I understand HMC, it does not address the problem of avoiding repeated computations. Instead, it provides a way to utilize knowledge of the derivative of the target (likelihood/score/posterior/measure/…) function to propose efficient sampling update steps. This has the potential to avoid many attempts to sample “in uninteresting directions” and in this way can speedup sampling. As far as I am aware, in the context of statistical physics and Bayesian parameter inference where HMC is used, the idea of reusing partial evaluations of the target function is not widespread. The target function is regarded as a function, not so much as a probabilistic program.
If I’m not up to date, I’d love to learn otherwise!

greedy · January 6, 2022, 3:07pm

I’m working on an implementation of Abstracting Definitional Interpreters. I was pretty excited about the prospect of using effect handlers but I would need multi-shot continuations to realize the non-determinism effect.

dhil · January 6, 2022, 11:45pm

For what it is worth, I am in the process of building a library for OCaml 5.00 that provides continuations with multi-shot semantics on top of OCaml’s native continuations. I have managed to get an initial prototype implementation working, though, it is still a bit rough around the edges (and probably has some bugs). E.g. at the time of writing it only works with native code compiler. I do plan to support for the byte code interpreter too. You can track the progress via GitHub - dhil/ocaml-multicont: multi-shot continuations in OCaml.

kayceesrk · January 7, 2022, 3:10am

Thanks Daniel. This is a good compromise. It allows the ecosystem to experiment with multishot continuations while not forcing the decision now on the multicore team.

The documentation may include problematic examples to let the user know what can go wrong.

gasche · January 7, 2022, 8:03am

It looks like ocaml-multicont tries to make it easy to mix linear-continuations and multishot-continuations in the same code. My experience was that this always blew up in my face.

(As @dhil surely knows, but for context:) As soon as “multishot effects” get nested (which is very soon when they are used for backtracking), the operation “inside” cannot safely by handled using linear continuations, as its handling logic may be resumed several times by the operation “outside” – this is the essence of the clone_is_tricky example.

Unless I missed something, the robust way to write handlers for nestable multishot operations is to assume that all continuations can be resumed several times. At the time of my experiments, that meant constantly using Obj.clone_continuation before continuing the (copy of the) continuation. In your library, I guess that would mean very careful to use only (multishot) resumptions, and never (linear) continuations. So I’m not sure what is the use-case for mixing both styles together.

gasche · January 7, 2022, 8:12am

On a related note: I don’t understand how to correctly drop a multishot continuation, if I never know if my own code is going to be resumed later. The only approach I know of would be for a resource-aware drop operation, that walks the stack and recursively drops the continuations that it itself binds. At this point one can ask for copy to work in a similar way. But this is not something that this library implements.

(Dreaming aloud: we could have a Custom-like class of dynamic values with explicit copy and drop operations, and explicit marks in our stacks for binding references to values, denoting that the stack captures the binder to this value. I wonder how that compares to @gadmm’s stuff.)

dhil · January 7, 2022, 12:56pm

Yep, this initial version has all of the usual caveats of Obj.clone_continuation. I have been an avid user of Obj.clone_continuation for my own research, so my primary interest is to bring it back so I can continue my work. Hopefully as we get experience programming with multi-shot continuations in an environment like OCaml, we can converge towards a more sensible design.

gadmm · January 7, 2022, 3:01pm

You point out the conflict between the following two statements: 1) copying a continuation only makes sense if we restrict them not to own resources, 2) fibers are themselves resources!

We get out of the conflict as follows: make multi-shot continuations purely copiable values (managed by the GC). This makes sense because they are not supposed to contain any resource that requires prompt clean-up. To do so, promoting a fiber into a multi-shot continuation either copies it to the heap and immediately discontinues the original linear value, or it attaches a finaliser to discontinue it when there are no more references to it. Then resuming the multi-shot continuation involves cloning it into a new fiber which is immediately instated (which is what @dhil’s library does). Thus fibers are only present in a transient manner before being promoted or reinstated, so you never need to drop or clone them explicitly.

Independently, it makes sense to have a clone operation for some resources (e.g. for sharing a resource between threads using reference-counting). But I do not see a use-case for permitting multi-shot continuations to own resources via making clone_continuation aware of the resources it has to clone. This looks complicated, and until there is a good use-case it is better to stick with a clear distinction between linear continuations which can own resources and multi-shot continuations which cannot. Surely this is enough to support probabilistic/non-deterministic programming. In the case of drop, multicore is designed around a simple solution which is to resume a linear continuation with an exception, which is why the problem does not arise.

My own stuff is based on starting from the Bacon-Cheng-Rajan paper and noticing that the useful design-space is covered by two simple well-behaved concepts of copiable values (handled by GC) and linear values (with effectful drop operation called promptly), which complement each other. Recursively cloning continuations would be against the philosophy. It supports “clonable” resources (which are regular resources, but with some clone operation), but does not make this a central concept as in Rust.

gasche · January 7, 2022, 3:28pm

Thanks, your explanation is very clear. But now I realize that I was confused about what ocaml-multicont implements: I thought it involved using clone_continuation in the usual (fragile) way, but from your explanation I see that it is substantially better. In fact it would suffice to have as an unsafe interface

external share : ('a, 'b) continuation -> ('a, 'b) resumption
external copy : ('a, 'b) resumption -> ('a, 'b) continuation

where share turns a linear continuation into a shareable resumption (A -o !A… assuming the continuation only captures resumptions, not continuations, so promotion is sound), and copy provides a linear copy of the shareable resumption (!A -o A).

(I think this is how existing “continuations” implementations, the call/cc one for bytecode and now delim/cc work, except they don’t expose the linear continuations as a first-class concept, they are just stack fragments that must be continued right after their recreation by a copy.)

I think that the current implementation itself confuses the matter (for me) by presenting resumptions exactly as continuations, programmed using a fairly different interface.

igarnier · January 7, 2022, 3:34pm

Hakaru10 implements an incremental MCMC, the page contains a nice description of the design. My current understanding is that this type of inference algorithm can be neatly described as representing a probabilistic execution trace as an incremental computation graph (similar to that implemented in libraries such as Incremental or lwd, etc). One picks a new trace by selecting a sampling node and picking a new sample from the attached distribution. The underlying incrementalization mechanism takes care of only recomputing the part of the computation graph depending on the selected node. Backtracking (as required by Metropolis-Hastings when a sample gets rejected) can be implemented either by reseting the sample to its previous value and replaying the sub-computation (assuming careful handling of side-effects) or implementing some explicit ‘undo’ by rewriting the graph to its previous state.

It’s getting off-topic, but I can’t resist adding that modularizing the incrementalization mechanism away allows for a quite succint and rather efficient implementation of a basic PPL implementing single-site Metropolis-Hastings. I’ve some (unreleased yet) experimental code there: dagger/src/traced.ml · master · Ilias Garnier / monorepo · GitLab

igarnier · December 11, 2022, 8:43am

I just tried out a multicont-based Sequential Monte Carlo (SMC) implementation versus a CPS-based implementation on a basic Kalman filter example.

In a nutshell, SMC works by iteratively updating a family of weighted continuations by dropping low-weight ones and cloning high-weight ones.

In the example, inference runs in 12.0 seconds for the CPS implementation and in 12.8 seconds for the multicont-based one. As can be witnessed, the multicont-based implementation is massively simpler. I love it. Thanks @dhil!

(side note: at the time of writing, you will need to opam pin add gsl git@github.com:Octachron/gsl-ocaml.git\#OCaml5_compatibility in order to compile the dagger-multicont branch)

kayceesrk · December 14, 2022, 3:13am

This is very nice to see. Thanks for sharing. If you have some pain points with using multi-shot continuations on OCaml 5, do let us know.

We don’t know how to do multi-shot continuations well in OCaml (yet). That shouldn’t stop us from experimenting and coming up with useful applications built on top of them. These applications will guide us in designing better abstractions for multi-shot continuations.

Topic		Replies	Views
User-friendly continuations over OCaml 5 effects? Ecosystem	12	1678	October 19, 2023
Language abstractions and scheduling techniques for efficient execution of parallel algorithms on multicore hardware Community multicore	19	3695	October 15, 2021
About Multicore Ecosystem multicore , compiler , faq	59	13377	April 10, 2018
OCaml multicore, effects and js_of_ocaml Ecosystem multicore , effects , js_of_ocaml	31	4121	October 5, 2021
Will OCaml 5+ multicore be fragile? Learning	22	4596	September 2, 2022

Multi-shot continuations gone forever?

Related topics