A tutorial on parallel programming in OCaml 5

I ran a hands-on tutorial on the new parallel programming primitives in the upcoming OCaml 5 at the Tarides off-site last week. It covers the low-level parallelism primitives exposed by the OCaml 5 compiler as well as high-level parallel programming using domainslib. I hope you like it and find it useful. Please feel free to open issues if you find anything amiss.

38 Likes

It is not immediately clear to me: does it use threads, green threads, processes, or fibers? And who is responsible for the scheduling, the OCaml application or the underlying operating system?

1 Like

Each domain corresponds to one system thread. The scheduling between them is therefore performed by the operating system.
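As a minimal sketch of what this looks like in code (the `fib` workload and the `parallel_fib` helper are illustrative names, not taken from the tutorial), each `Domain.spawn` below starts a domain that the operating system schedules as its own thread, and `Domain.join` waits for its result:

```ocaml
(* Requires OCaml 5: Domain is part of the standard library. *)

(* A deliberately naive, CPU-bound function used as the workload. *)
let rec fib n = if n < 2 then n else fib (n - 1) + fib (n - 2)

(* Spawn one domain per input, then wait for all of them.
   Results come back in the same order as the inputs. *)
let parallel_fib inputs =
  let domains = List.map (fun n -> Domain.spawn (fun () -> fib n)) inputs in
  List.map Domain.join domains

let () =
  let results = parallel_fib [ 20; 21; 22 ] in
  List.iter (fun r -> Printf.printf "%d\n" r) results
```

Since domains are heavy-weight, spawning one per work item as done here only makes sense for a handful of items; it is usually better to keep the number of domains at or below the number of cores.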

The tutorial only covers domains, which are the way to perform parallelism in OCaml 5. For concurrency (e.g. having several I/O-bound operations that run concurrently on the same core), the main mechanism is effects (which, at the level of the runtime system, are implemented using small stack segments called fibers), as in the eio library. Effects allow such libraries to provide a form of lightweight threads (aka green threads) whose scheduling is implemented in the OCaml application using effect handlers.
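To make the "scheduling is implemented in the OCaml application" part concrete, here is a toy round-robin scheduler built on the standard `Effect` module of OCaml 5. It is only a sketch: the `Yield` effect and the names `run_all`, `task` and `demo` are made up for this illustration, and a real library such as eio is far more elaborate.

```ocaml
open Effect
open Effect.Deep

(* Hypothetical effect: a task performs it to give up the processor. *)
type _ Effect.t += Yield : unit Effect.t

let yield () = perform Yield

(* Run all tasks to completion on the current domain, switching between
   them whenever one of them yields. *)
let run_all (tasks : (unit -> unit) list) : unit =
  let run_queue : (unit -> unit) Queue.t = Queue.create () in
  let run_next () =
    match Queue.take_opt run_queue with
    | Some resume -> resume ()
    | None -> ()
  in
  let spawn task =
    match_with task ()
      { retc = (fun () -> run_next ());
        exnc = raise;
        effc = (fun (type a) (eff : a Effect.t) ->
          match eff with
          | Yield ->
            Some (fun (k : (a, unit) continuation) ->
              (* Park the suspended task at the back of the queue. *)
              Queue.add (fun () -> continue k ()) run_queue;
              run_next ())
          | _ -> None) }
  in
  List.iter (fun t -> Queue.add (fun () -> spawn t) run_queue) tasks;
  run_next ()

(* Record the interleaving so that it can be inspected afterwards. *)
let trace : string list ref = ref []
let log s = trace := s :: !trace

let task name () =
  log (name ^ "1");
  yield ();
  log (name ^ "2")

let demo () =
  trace := [];
  run_all [ task "a"; task "b" ];
  List.rev !trace
```

Everything here runs on a single domain: the interleaving of the two tasks is decided entirely by `run_all`, not by the operating system.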

5 Likes

Thank you for spending the time doing this. One query.

You say “Whenever a domain exhausts its minor heap arena, it calls for a stop-the-world, parallel minor GC, where all the domains collect their minor heaps.” If a domain collecting its minor heap includes moving values to the major heap, and if one domain can trigger a minor heap collection in another domain, how do the CAMLparam macros still work correctly? In theory there could be a window of vulnerability between a function beginning (setting up its stack frame and its value arguments) and the macro being applied to those arguments, caused by allocations in an entirely different domain.

Indeed if every domain has its own minor heap, what’s the point of one domain’s minor heap collection triggering another domain’s collection?

Quoted directly from the tutorial:

Domains are heavy-weight entities. Each domain directly maps to an operating system thread.

The rest of the answers you seek are also there in the tutorial. The concepts are introduced in a piecemeal fashion. I encourage you to have a read.

4 Likes

For the CAMLparam macros, the discipline is the same as in OCaml 4: the user needs to ensure that allocation functions (caml_alloc*) are not called before the parameters are registered.

For the stop-the-world sections, OCaml 5 doesn’t stop a domain’s execution at an arbitrary point (by interrupting the threads with signals, for example). Whenever a domain triggers a stop-the-world section, the other domains join the stop-the-world barrier at their next allocation point. This ensures that the discipline followed for the correct use of the CAMLparam macros in OCaml 4 is sufficient in OCaml 5.

Indeed if every domain has its own minor heap, what’s the point of one domain’s minor heap collection triggering another domain’s collection?

There could be cross minor-heap pointers. We cannot independently collect the minor heap of domains without handling this case.
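A small sketch of how such a pointer can arise (the function name `publish_and_read` is made up for this illustration, and where exactly each value lives is a runtime implementation detail, so take the comments as an assumption): one domain allocates a fresh list and publishes it through a shared `Atomic.t`, after which another domain holds a reference to it.

```ocaml
(* Requires OCaml 5. *)
let publish_and_read () =
  let cell : int list option Atomic.t = Atomic.make None in
  let producer =
    Domain.spawn (fun () ->
      (* Freshly allocated, so presumably still in the producer's minor heap. *)
      let fresh = List.init 3 (fun i -> i + 1) in
      Atomic.set cell (Some fresh))
  in
  (* The main domain spins until the value has been published. From then on
     it holds a pointer to a value allocated by the other domain. *)
  let rec wait () =
    match Atomic.get cell with
    | Some l -> l
    | None -> Domain.cpu_relax (); wait ()
  in
  let l = wait () in
  Domain.join producer;
  List.fold_left ( + ) 0 l
```

If the producer's minor heap were collected on its own, the reference held by the main domain would have to be found and updated somehow, which is the case that a stop-the-world minor collection across all domains takes care of.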

OK, I see, thanks. It would be good if the OCaml 5 equivalent of chapter 20 of the OCaml manual, when it comes out, were to make some comment along those lines, and I imagine it will do so.

This is an impressive body of work.

Here is a very simple tutorial on parallel programming in OCaml: use parany!

For OCaml 5, use the right branch of parany:

Happy hacking!
F.

2 Likes