Multicore OCaml: September 2021, effect handlers will be in OCaml 5.0!

Welcome to the September 2021 Multicore OCaml monthly report! This month’s update, along with the previous updates, has been compiled by me, @ctk21, @kayceesrk and @shakthimaan. The team has been working over the past few months to finish the last few features necessary to reach feature parity with stock OCaml. We also worked closely with the core OCaml team to develop the timeline for upstreaming Multicore OCaml to stock OCaml, and have now agreed that:

OCaml 5.0 will support shared-memory parallelism through domains and direct-style concurrency through effect handlers (without syntactic support).

The new code will have to go through the usual rigorous review process of contributions to upstream OCaml, but we expect to advance the review process over the next few months.

Recap: what are effect handlers?

Below is an excerpt from “Retrofitting Effect Handlers onto OCaml”:

Effect handlers provide a modular foundation for user-defined effects. The key idea is to separate the definition of the effectful operations from their interpretations, which are given by handlers of the effects. For example:

effect In_line : in_channel -> string

declares an effect In_line, which is parameterised with an input channel of type in_channel, which when performed returns a string value. A computation can perform the In_line effect without knowing how the In_line effect is implemented. This computation may be enclosed by different handlers that handle In_line differently. For example, In_line may be implemented by performing a blocking read on the input channel or performing the read asynchronously by offloading it to an event loop such as libuv, without changing the computation.

Thanks to the separation of effectful operations from their implementation, effect handlers enable new approaches to modular programming. Effect handlers are a generalisation of exception handlers, where, in addition to the effect being handled, the handler is provided with the delimited continuation of the perform site. This continuation may be used to resume the suspended computation later. This enables non-local control-flow mechanisms such as resumable exceptions, lightweight threads, coroutines, generators and asynchronous I/O to be composably expressed.

The implementation of effect handlers in OCaml are single-shot – that is, a continuation can be resumed only once, and must be explicitly discontinued if not used. This restriction makes for easier reasoning about control flow in the presence of mutable data structures, and also allows for a high performance implementation.

You can read more about effect handlers in OCaml in the full paper.
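As a rough illustration of the excerpt, here is a minimal sketch of the In_line effect written against the function-based API. It assumes the `Effect` module as shipped in released OCaml 5 (on the 4.12+domains switch discussed in this report, the equivalent functions live under `Obj.Effect_handlers`). The names `read_two`, `with_blocking_read`, `with_canned_lines` and `second_resume_fails` are hypothetical and exist only for this example.

```ocaml
(* Sketch only: assumes OCaml >= 5.0 and its stdlib Effect module. *)
open Effect
open Effect.Deep

(* The effect from the excerpt, declared without dedicated syntax. *)
type _ Effect.t += In_line : in_channel -> string Effect.t

(* A computation that performs In_line without knowing how it is handled. *)
let read_two ic =
  let a = perform (In_line ic) in
  let b = perform (In_line ic) in
  a ^ "," ^ b

(* Handler 1: interpret In_line as a blocking read on the channel. *)
let with_blocking_read f x =
  try_with f x
    { effc = fun (type a) (e : a Effect.t) ->
        match e with
        | In_line ic ->
          Some (fun (k : (a, _) continuation) -> continue k (input_line ic))
        | _ -> None }

(* Handler 2 (hypothetical): ignore the channel and serve lines from a list. *)
let with_canned_lines lines f x =
  let remaining = ref lines in
  try_with f x
    { effc = fun (type a) (e : a Effect.t) ->
        match e with
        | In_line _ ->
          Some (fun (k : (a, _) continuation) ->
            match !remaining with
            | [] -> discontinue k End_of_file
            | l :: rest -> remaining := rest; continue k l)
        | _ -> None }

(* Single-shot: resuming the same continuation a second time is rejected. *)
let second_resume_fails =
  try_with (fun () -> perform (In_line stdin)) ()
    { effc = fun (type a) (e : a Effect.t) ->
        match e with
        | In_line _ ->
          Some (fun (k : (a, _) continuation) ->
            let _first : string = continue k "once" in
            match continue k "twice" with
            | _ -> "resumed twice"
            | exception Continuation_already_resumed -> "second resume rejected")
        | _ -> None }
```

The same `read_two` computation can be run as `with_blocking_read read_two stdin` or as `with_canned_lines ["a"; "b"] read_two stdin` without being changed, which is the separation the excerpt describes.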

Why is there no syntactic support for effect handlers in OCaml 5.0?

Effect handlers currently in Multicore OCaml do not ensure effect safety. That is, the compiler will not ensure that all the effects performed by the program are handled. Instead, unhandled effects lead to exceptions at runtime. Since we plan to extend OCaml with support for an effect system in the future, OCaml 5.0 will not feature the syntactic support for programming with effect handlers. Instead, we expose the same features through functions from the standard library, reserving the syntax decisions for when the effect system lands. The function-based effect handlers are just as expressive as the current syntax-based version in Multicore OCaml. As an example, the syntax-free version of:

effect E : string 

let comp () =
  print_string "0 ";
  print_string (perform E);
  print_string "3 "
     
let main () = 
  try 
    comp () 
  with effect E k -> 
    print_string "1 "; 
    continue k "2 ";  
    print_string "4 "

will be:

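(* Opens needed on the 4.12+domains switch; the module path may change by 5.0. *)
open Obj.Effect_handlers
open Obj.Effect_handlers.Deep

type _ eff += E : string eff

let comp () =
  print_string "0 ";
  print_string (perform E);
  print_string "3 "

let main () =
  try_with comp ()
  { effc = fun (type a) (e : a eff) ->
      match e with
      | E -> Some (fun (k : (a, _) continuation) ->
          print_string "1 ";
          continue k "2 ";
          print_string "4 ")
      | _ -> None }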

One can imagine writing a ppx extension that enables programmers to write code that is close to the earlier version.
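For readers on a released OCaml 5 compiler, the following is a hedged sketch of the same example against the stdlib `Effect` module (an assumption about where the functions ended up; the report itself only promises that the module path may change). Output is collected in a buffer instead of being printed so that the ordering can be checked, and the names `out`, `emit`, `trace` and `unhandled_raises` are hypothetical helpers for this sketch. It also shows what "unhandled effects lead to exceptions at runtime" looks like in practice.

```ocaml
(* Sketch only: assumes OCaml >= 5.0 and its stdlib Effect module. *)
open Effect
open Effect.Deep

type _ Effect.t += E : string Effect.t

(* Collect output in a buffer; the example in the post prints instead. *)
let out = Buffer.create 16
let emit = Buffer.add_string out

let comp () =
  emit "0 ";
  emit (perform E);
  emit "3 "

let main () =
  try_with comp ()
    { effc = fun (type a) (e : a Effect.t) ->
        match e with
        | E -> Some (fun (k : (a, _) continuation) ->
            emit "1 ";
            continue k "2 ";   (* resume comp; it emits "2 " and "3 " *)
            emit "4 ")         (* runs after comp has finished *)
        | _ -> None }

(* Run the handled computation and record what was emitted. *)
let trace =
  main ();
  Buffer.contents out

(* With no enclosing handler, performing E raises Effect.Unhandled. *)
let unhandled_raises =
  match perform E with
  | (_ : string) -> false
  | exception Unhandled _ -> true
```

Here `trace` evaluates to "0 1 2 3 4 ", since the handler body continues after the resumed computation returns.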

Which opam switch should I use today?

The 4.12+domains opam switch has all the features that will go into OCaml 5.0, including the effect-handlers-as-functions. The exact module under which the functions go will likely change by 5.0, but the basic form should remain the same.

The 4.12+domains+effects opam switch will be preserved, but the syntax will not be upstreamed. This switch is mainly useful to try out the examples of OCaml effect handlers in the academic literature.

To learn more about programming using this effect system, see the eio library and this recent talk. In the next few weeks, the eio library will be ported to 4.12+domains to use the function based effect handlers so that it is ready for OCaml 5.0.

On to the September 2021 update

A number of enhancements have been merged to improve the thread safety of the stdlib and the test suite coverage, along with the usual bug fixes. The documentation for the ecosystem projects has been updated for readability, grammar and consistency. The sandmark-nightly web service is currently being Dockerized to be deployed for visualising and analysing benchmark results. The Sandmark 2.0-beta branch has also been released with the 2.0 features, and is available for testing and feedback.

We would like to acknowledge the following people for their contribution:

  • @lingmar (Linnea Ingmar) for reporting a segmentation fault in 4.12.0+domains at caml_shared_try_alloc.
  • @dhil (Daniel Hillerström) for providing a patch to remove drop_continuation in the compiler sources.
  • @nilsbecker (Nils Becker) for reporting a crash with 14 cores when using Task.pool management.
  • @cjen1 (Chris Jensen) for observing a Unix.ENOMEM error when trying out the Eio README example, and using ulimit to fix it.
  • @anuragsoni (Anurag Soni) for contributing an async HTTP benchmark for retro-httpaf-bench.

As always, the Multicore OCaml updates are listed first, which are then followed by the updates from the ecosystem tools and libraries. The Sandmark-nightly work-in-progress and the Sandmark benchmarking tasks are finally listed for your reference.

Multicore OCaml

Ongoing

Thread Safe

Segmentation Fault

  • ocaml-multicore/ocaml-multicore#639
    Segfaults in GC

    An ongoing investigation into the segmentation fault caused at
    caml_shared_try_alloc in 4.12.0+domains as reported by @lingmar
    (Linnea Ingmar).

  • ocaml-multicore/ocaml-multicore#646
    Coq segfaults during build

    The Coq proof assistant results in a segmentation fault when run
    with Multicore OCaml, and a new tarball has been provided for
    testing.

Test Suite

  • ocaml-multicore/ocaml-multicore#640
    GitHub Actions for Windows

    The GitHub Actions have been updated to run the Multicore OCaml test
    suite on Windows.

  • ocaml-multicore/ocaml-multicore#641
    Get the multicore testsuite runner to parity with stock OCaml

    The tests disabled in Multicore need to be reviewed to see if they
    can be re-enabled, and the tests should also be run in parallel,
    similar to trunk.

Sundries

  • ocaml-multicore/ocaml-multicore#637
    caml_page_table_lookup is not available in ocaml-multicore

    The ancient package uses the Is_in_heap_or_young macro, which
    internally uses caml_page_table_lookup, which is not yet implemented
    in Multicore OCaml.

  • ocaml-multicore/ocaml-multicore#653
    Drop drop_continuation

    A PR contributed by @dhil (Daniel Hillerström) to remove
    drop_continuation since clone_continuation has also been
    removed.

Completed

Upstream

  • ocaml-multicore/ocaml-multicore#631
    Don’t raise asynchronous exceptions from signals in caml_alloc C functions

    A PR that prevents asynchronous exceptions being raised from signal
    handlers, and avoids polling for pending signals from caml_alloc_*
    calls from C.

  • ocaml-multicore/ocaml-multicore#638
    Add some injectivity annotations to the standard library

    The injectivity annotations have been backported to stdlib from
    4.12.0 in order to compile stdcompat with Multicore OCaml.

  • ocaml-multicore/ocaml-multicore#642
    Remove the remanents of page table functionality

    Page tables are not used in Multicore OCaml, and the respective
    macro and function definitions have been removed.

  • ocaml-multicore/ocaml-multicore#643
    Core_kernel minor words report are off

    The report of allocated words is skewed because young_ptr and
    young_end are defined as char *. The PR to change them to value * has now been merged.

  • ocaml-multicore/ocaml-multicore#652
    Make young_start/end/ptr pointers to value

    The young_start, young_end, and young_ptr pointers in Multicore
    OCaml now use value * instead of char *, to align with trunk.

Backports

Thread Safe

  • ocaml-multicore/ocaml-multicore#630
    Make signals safe for Multicore

    The signals implementation has been overhauled in Multicore OCaml
    with clear and correct semantics.

  • ocaml-multicore/ocaml-multicore#635
    Make lib-str domain safe

    The PR moves the use of global variables in str to thread local
    storage. A test case that does str computations in parallel has
    also been added.

Effect Handlers

  • ocaml-multicore/ocaml-multicore#650
    Add primitives necessary for exposing effect handlers as functions

    This PR adds the primitives that allow 4.12+domains to continue to
    work with changes from 4.12+domains+effects.

  • ocaml-multicore/ocaml-multicore#651
    Expose deep and shallow handlers as functions

    The PR exposes deep and shallow handlers as functions in the Obj
    module. It also removes the ability to clone continuations.
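To make the deep/shallow distinction concrete, here is a small sketch that handles the same effect both ways. It assumes the `Effect.Deep` and `Effect.Shallow` modules of released OCaml 5 rather than the `Obj` location used on the 4.12+domains switch at the time of this report. The `Yield` effect and the functions `producer`, `sum_deep` and `sum_shallow` are hypothetical names for this illustration.

```ocaml
(* Sketch only: assumes OCaml >= 5.0 and its stdlib Effect module. *)
open Effect

type _ Effect.t += Yield : int -> unit Effect.t

(* A computation that yields three integers. *)
let producer () = List.iter (fun i -> perform (Yield i)) [1; 2; 3]

(* Deep handler: installed once, it also handles every later Yield
   performed after the continuation is resumed. *)
let sum_deep f =
  let total = ref 0 in
  Deep.match_with f ()
    { Deep.retc = (fun () -> !total);
      exnc = raise;
      effc = fun (type a) (e : a Effect.t) ->
        match e with
        | Yield i -> Some (fun (k : (a, _) Deep.continuation) ->
            total := !total + i;
            Deep.continue k ())
        | _ -> None }

(* Shallow handler: it handles only the next effect, so a handler must be
   supplied again on every resumption; the accumulator is threaded through. *)
let sum_shallow f =
  let rec loop (k : (unit, unit) Shallow.continuation) acc =
    Shallow.continue_with k ()
      { Shallow.retc = (fun () -> acc);
        exnc = raise;
        effc = fun (type b) (e : b Effect.t) ->
          match e with
          | Yield i -> Some (fun (k' : (b, unit) Shallow.continuation) ->
              loop k' (acc + i))
          | _ -> None }
  in
  loop (Shallow.fiber f) 0
```

Both `sum_deep producer` and `sum_shallow producer` add up the yielded values; the shallow version trades the mutable reference for an explicit loop that reinstalls the handler each time.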

Sundries

  • ocaml-multicore/ocaml-multicore#633
    Error building 4.12.0+domains with no-flat-float-arrays

    The linker error has been fixed in
    PR#644.

  • ocaml-multicore/ocaml-multicore#647
    Improving Multicore’s issue template

    The Multicore OCaml bug report template has been improved with
    sections for Describe the issue, To reproduce, Multicore OCaml build version, Did you try running it with the debug runtime and heap verification ON?, and Backtrace.

Ecosystem

Ongoing

  • ocaml-multicore/domainslib#43
    Possible bug in Task.pool management

    A segmentation fault on Task.pool management when using 14 cores as
    reported by @nilsbecker (Nils Becker).

  • ocaml-multicore/multicore-opam#59
    Fix batteries after ocaml-multicore/ocaml-multicore#514

    Update the batteries.3.3.0+multicore opam file for
    batteries-included with the correct src URL.

  • ocaml-multicore/multicore-opam#60
    Multicore domains+effects language server does not work with VS Code

    A Request textDocument/hover failed error shows up with VS Code
    when using Multicore domains+effects language server.

  • ocaml-multicore/eio#81
    Is IO prioritisation possible?

    A query on IO prioritisation and on scheduling of fibres for
    consensus systems.

Completed

Build

  • ocaml-multicore/eventlog-tools
    Use ocaml/setup-ocaml@v2

    The GitHub workflows have now been updated to use the 4.12.x
    ocaml-compiler and ocaml/setup-ocaml@v2 in the
    .github/workflows/main.yml file.

  • ocaml-multicore/tezos#3
    Add cron job and run tests

    The CI Dockerfile and GitHub workflows have been changed to run the
    tests periodically for Tezos on Multicore OCaml.

  • ocaml-multicore/tezos#4
    Run cronjob daily

    The GitHub cronjob is now scheduled to run daily for the Tezos
    builds from scratch.

  • ocaml-multicore/retro-httpaf-bench#12
    Dockerfile fails to build

    The issue no longer exists, and the Dockerfile now builds fine in
    the CI as well.

  • ocaml-multicore/eio#80
    ENOMEM with README example

    @cjen1 (Chris Jensen) reported a Unix.ENOMEM error that prevented
    the following README example code snippet from running. Using
    ulimit with a smaller memory size fixes the issue.

    #require "eio_main";;
    open Eio.Std;;
    
    let main ~stdout = Eio.Flow.copy_string "hello World" stdout;;
    Eio_main.run @@ fun env -> main ~stdout:(Eio.Stdenv.stdout env)
    ;;
    
Documentation

  • ocaml-multicore/parallel-programming-in-multicore-ocaml#10
    Edited for flow/syntax/consistency

    The Parallel Programming in Multicore OCaml chapter has been
    reviewed and updated for consistency, syntax flow and readability.

  • ocaml-multicore/eio#79
    Initial edits for consistency, formatting and clarity

    The README in the Eio project has been updated for consistency,
    formatting and readability.

  • The ocaml2020-workshop-parallel
    README has been updated with reference links to books, videos,
    project repository, and the OCaml Multicore wiki.

Benchmarks

Benchmarking

Sandmark-nightly

Ongoing

  • ocaml-bench/sandmark-nightly#10
    Dockerize sandmark-nightly

    The sandmark-nightly service needs to be dockerized to be able to
    run on multiple machines.

  • ocaml-bench/sandmark-nightly#11
    Refactor the sandmark-nightly notebooks

    The code in the sandmark-nightly notebooks needs to be refactored and
    modularized so that it can be reused as a library.

  • ocaml-bench/sandmark-nightly#12
    Normalization graphs (with greater than two benchmarks) needs to be fixed

    The normalization graphs only produce one coloured bar group even if
    there are more than two benchmarks. It needs to show more than one
    coloured graph when compared with the baseline.

  • ocaml-bench/sandmark-nightly#13
    Store the logs from the nightly runs along with the results

    The nightly run logs can be stored as they are useful for debugging
    any failures.

  • ocaml-bench/sandmark-nightly#14
    Add best-fit variant to sequential benchmarks

    The sandmark-nightly runs should include the best-fit allocator as
    it is better than the next-fit allocator. The best-fit allocator can
    be enabled using the following command:

    $ OCAMLRUNPARAM="a=2" ./a.out
    
  • ocaml-bench/sandmark-nightly#16
    Cubicle and Coq benchmarks are missing from the latest navajo nightly runs

    The UI for the sequential benchmarks fails to load normalized graphs
    because of missing Cubicle and Coq benchmark .bench files.

  • ocaml-bench/sandmark-nightly#17
    Navajo runs are on stale Sandmark

    The Sandmark deployed on navajo needs to be updated to the latest
    Sandmark, and the git pull is failing due to uncommitted changes
    to the Makefile.

Sandmark

Ongoing

  • ocaml-bench/sandmark#248
    Coq fails to build

    A new Coq tarball to build with Multicore OCaml is now available for
    testing at
    coq-multicore-2021-09-24.

  • Sandmark 2.0-beta

    The Sandmark
    2.0-beta
    branch is now available for testing. It includes new features such
    as package override option, adding meta-information to the benchmark
    results, running multiple iterations, classification of benchmarks,
    user configuration, and simplifies package dependency
    management. You can test the branch for the following OCaml compiler
    variants:

    • 4.12.0+domains
    • 4.12.0+stock
    • 4.14.0+trunk

    $ git clone https://github.com/ocaml-bench/sandmark.git
    $ cd sandmark
    $ git checkout 2.0-beta
    
    $ make clean; TAG='"run_in_ci"' make run_config_filtered.json
    $ RUN_CONFIG_JSON=run_config_filtered.json make ocaml-versions/4.12.0+domains.bench
    
    $ make clean; TAG='"run_in_ci"' make run_config_filtered.json
    $ RUN_CONFIG_JSON=run_config_filtered.json make ocaml-versions/4.12.0+stock.bench
    
    $ make clean; TAG='"run_in_ci"' make run_config_filtered.json
    $ RUN_CONFIG_JSON=run_config_filtered.json make ocaml-versions/4.14.0+trunk.bench
    
    $ make clean; TAG='"macro_bench"' make multicore_parallel_run_config_filtered.json
    $ RUN_BENCH_TARGET=run_orunchrt BUILD_BENCH_TARGET=multibench_parallel RUN_CONFIG_JSON=multicore_parallel_run_config_filtered.json make ocaml-versions/4.12.0+domains.bench
    

    Please report any issues that you face in our GitHub
    project
    page.

Completed

  • ocaml-bench/sandmark#251
    Update dependencies to work with 4.14.0+trunk

    The Sandmark master branch dependencies have now been updated to
    build with 4.14.0+trunk.

  • ocaml-bench/sandmark#253
    Remove Domain.Sync.poll() from parallel benchmarks

    The Domain.Sync.poll() function call is now deprecated and the same
    has been removed from the parallel benchmarks in Sandmark.

  • ocaml-bench/sandmark#254
    Disable sandboxing

    The --disable-sandboxing option is now passed as default to opam
    when setting up the local _opam directory for Sandmark builds.

We would like to thank all the OCaml users, developers and contributors in the community for their continued support to the project. Stay safe!

Acronyms

  • CI: Continuous Integration
  • CPU: Central Processing Unit
  • GC: Garbage Collector
  • HTTP: Hypertext Transfer Protocol
  • IO: Input/Output
  • OPAM: OCaml Package Manager
  • PR: Pull Request
  • UI: User Interface
  • URL: Uniform Resource Locator
  • VS: Visual Studio

In multicore the OCaml runtime will be much more sophisticated than before. Inevitably there will be bugs in user code due to data structures being accessed by threads running in parallel.

What is the upcoming OCaml debugging story? Firing up the native executable and debugging using gdb/rr is going to be difficult because all you will see is low level stuff.

I know there was a project a few years ago by @mshinwell to improve support of OCaml in gdb but that project was never merged into trunk. How will people debug OCaml multicore programs more easily? Are there plans to revive improved gdb support for OCaml?


Firstly, congratulations to all of the people working on multicore (and effects), this feels like a huge and well-deserved milestone :tada: !

Secondly, I think I remember reading that some of the graphs included were a little inaccessible, so I’m adding a comment with hopefully an alternative version for people who struggle to distinguish between the colours and a textual description of the general trends as I see it.

Benchmarks are shown for different HTTP server implementations, plotting load requests per second on the x-axis against serviced requests per second on the y-axis. The implementations include one in Rust using the Hyper framework, an OCaml implementation using httpaf, async and shuttle, an OCaml implementation using httpaf with eio, another httpaf implementation using effects, an httpaf implementation with lwt, one in Go using Net, and a cohttp-lwt OCaml one. This is the same order as the final performance hierarchy shows.

A brief description of the plot follows. By the time we reach 140,000 loads per second, the graph shows the Rust implementation servicing around 140,000 requests per second with no sign of slowing down at this stage. The httpaf-shuttle-async implementation is at around 120,000 serviced requests at this stage, having started to underperform Rust at around 100,000 loads per second. Httpaf-eio is just below shuttle at around 100,000 serviced requests and diverged from Rust at the same point. Httpaf-effects is at around 90,000 serviced requests but started underperforming at just before 80,000 loads per second. Httpaf-lwt has plateaued at around 60,000 serviced requests with the Go implementation fractionally below that, both having started to underperform at around 50,000 loads per second. Finally, cohttp-lwt made it to around 30,000 load requests per second before plateauing immediately at the same amount.


I have not tried using this myself but could the VS Code debug adapter for OCaml be a viable solution?

This is quite exciting. When is OCaml 5 expected to be released?


Just thinking out loud here, but if effects are a generalized form of exceptions, and there is a plan for the compiler to ensure effect safety at some point, then does that mean that effectively it will also ensure exception safety at compile time? I.e. a more advanced form of Type-based analysis of uncaught exceptions.


I have not tried using this myself but could the VS Code debug adapter for OCaml be a viable solution?

The debug adapter that you link to is a VS Code extension to interoperate with the ocamldebug program that comes with OCaml. ocamldebug is a command line debugger for OCaml bytecode programs.

The questions I have, OTOH, are: (a) Are there plans to improve support for debugging native multicore OCaml executables under gdb (basically better OCaml-specific support while using gdb, just like gdb provides decent support while debugging Rust, for instance)? (b) Relatedly, what is ocamldebug’s future for debugging bytecode under multicore?


The multicore effort is all on track for integration into OCaml 5.0 early next year

The above according to Multicore OCaml: August 2021 at least.

We’ve gone to some effort to preserve DWARF unwinding correctly in multicore OCaml (see the effects paper for more details). You may also want to check the debugging tips and tricks in the OCaml multicore wiki which has info on how to use gdb and rr. You do get your functions back as mangled names, but it’s pretty easy to visually map those back to their original OCaml function names by inspection.


Secondly, I think I remember reading that some of the graphs included were a little inaccessible, so I’m adding a comment with hopefully an alternative version for people who struggle to distinguish between the colours and a textual description of the general trends as I see it.

Thanks a lot for that. As a colorblind person, I really appreciate it. As a way to partially automate that, there are markers with matplotlib. I’ve taken a look at the repo, and I believe that this plot is generated on the rps_df.plot(xlabel="load requests/second", ylabel="serviced requests/second") line. If that’s the case, here’s how to add markers to it:

markers = ["o", "v", "^", "s", "D", "p", "|"]

plot = rps_df.plot(xlabel="load requests/second", ylabel="serviced requests/second")
for (line, marker) in zip(plot.get_lines(), markers):
    line.set_marker(marker)

plot.legend(plot.get_lines(), rps_df.columns)

I haven’t been able to run the benchmarks on my machine, so I can’t show you the real version, but here’s a minimal example with the result:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

d = { 'ex1': [1, 2, 3, 4, 5], 'ex2': [1, 2.5, 3.6, 5, 7], 'ex3': [1, 2, 2.5, 3, 3.5] }
df = pd.DataFrame(d)

markers = ["o", "v", "^"]

plot = df.plot(xlabel="a label", ylabel="another label")
for line, marker in zip(plot.get_lines(), markers):
    line.set_marker(marker)

plot.legend(plot.get_lines(), df.columns)

[Figure: output of the minimal example]


Which opam switch should I use today?

Is the 2nd snippet supposed to compile on 4.12+domains, or is the effects-as-fns version pseudocode? Forgive me if I misread the above update. :slight_smile:


Yes, 4.12.0+domains is good and should compile the snippet (modulo some omitted opens).

See this example from the testsuite: Expose deep and shallow handlers as functions by kayceesrk · Pull Request #651 · ocaml-multicore/ocaml-multicore · GitHub


Appreciate your reply. I’m using an opam 4.12+domains switch. However, I am unable to successfully open Obj.Effect_handlers and open Obj.Effect_handlers.Deep. I get:

1 | open Obj.Effect_handlers
         ^^^^^^^^^^^^^^^^^^^
Error: Unbound module Obj.Effect_handlers

Is there a library I need to add to my dune file?

P.S. I investigated this some more – the PR Expose deep and shallow handlers as functions by kayceesrk · Pull Request #651 · ocaml-multicore/ocaml-multicore · GitHub that is linked to was merged on 22 Sep 2021. But the opam switch 4.12+domains that I have on my local machine is older than that. I realize the switch is based on the multicore git repository and not a frozen archive file!

opam switch reinstall

(probably a good idea to run opam update before issuing this call btw)

Fixes the problem!

Now I’m going to try to get the example working.


Thanks! What a treat. I tried it out on a toy project and it was a delight. Excellent work, all.


Hi All,

The multicore eio lib has now been migrated to the syntax-free version. This is the version of effects which will be available in the upcoming OCaml 5.0.0.

PR : Migrate to 4.12.0+domains effects implementation(syntax-free effects version) by bikallem · Pull Request #82 · ocaml-multicore/eio · GitHub

Warm regards
Bikal


I started a ppx. Not polished in the slightest, but compiles and does the basics.

let comp () = perform (A ()) in (* snip snip snip *)
let () =
  let handle_a _ continue k = continue k () in
  let handle_b _ continue k = continue k () in
  [%with_effects comp () [| A handle_a; B handle_b; |]]

Demo link. Runnable, emits semi-interesting output :slight_smile:


I opened a PR to support try .. with .. as well:

  try%effect comp on_complete with
  | A (s, k) -> handle_a s k
  | B (s, k) -> handle_b s k
  | C (s, k) -> handle_b s k

Very cool! Great to see more PPX authors out there.

You may be interested in CraigFe/ppx_effects, which I’ve been tinkering with for a few days. It has the same goal, but makes some different API choices. We should join forces! :tada:


Super cool. I ended up swapping my impl fully with @Gopiandcode’s impl. I’d be interested in co-hacking on and maintaining the pkg. If you have a community repo in mind, please share! Otherwise, I’d be happy to seed one over in the Dino-DNA GitHub org.

Call the ball :slight_smile:
