Multicore OCaml: May 2020
Welcome to the May 2020 update from the Multicore OCaml team! As with previous updates, many thanks to @shakthimaan and @kayceesrk for help assembling this month’s roundup.
A major milestone in May 2020 has been the completion of rebasing of Multicore OCaml all the way from 4.06 to 4.10! The Parallel Minor GC variant that performs stop-the-world parallel minor collection is the default branch for the compiler, which means that compatibility with C bindings is now much simpler than with the older minor GC design.
I’ve received many questions asking if this means that multicore OCaml will “just work” with the opam ecosystem now. Not quite yet: we estimate that we are now one PR away from this working, which requires that the existing
Threads module is backported to multicore OCaml to support the older (non-parallel-in-the-runtime but concurrent) uses of threading that existing OCaml supports. This effort was begun a year ago by @jhw in #240 and now rebased and being reviewed by @engil in #342. Once that is merged and tested by us on a bunch of packages and bulk builds, we should be good to start using Multicore OCaml with opam. Stay tuned for more on that next month!
The ongoing and completed tasks for the Multicore OCaml are listed first, which are then followed by improvements to the Sandmark benchmarking project. Finally, the status of the contributions to upstream OCaml are mentioned for your reference. This month has also seen a meeting of the core OCaml runtime developers to assign post-rebasing tasks (such as also porting statmemprof, how to handle non-x86 architectures, Windows support, etc) to ensure a more complete view of the upstreaming tasks ahead. The task list is long, but steadily decreasing in length.
As to how to contribute currently, there is an incredibly exciting seam of work that has now started on the appropriate programming abstractions to support parallel algorithms in OCaml. See this thread for more on that, and also on the Domainslib repository for more low-level examples of traditional parallel algorithms. In a month or so, we expect that the multicore switch will also be more suitable for use with opam, but don’t let that stop you from porting your favourite parallel benchmark to Domainslib today.
Proposal for domain-local storage
A new proposal for implementing a domain-local storage in Multicore
OCaml has been created.
Task library slowdown if the number of domains is greater than 8
This is an ongoing investigation on why there is a slowdown with
domainslibversion 0.2 for the Game of Life benchmark when the
number of domains is greater than eight.
Fix Atomic.exchange in concurrent_minor_gc
An implementation is provided for
Atomic.compare_and_setto obtain the correct
semantics to handle assertion failure in interp.c.
Introduce Lazy.try_force and Lazy.try_force_val
implemented for concurrent lazy abstractions to handle the RacyLazy
Random module functions slowdown on multiple cores
There is an observed slowdown for the
Randommodule on multiple
cores, and the issue is being analysed in detail.
Fix extcall noalloc DWARF
The patch provides a fix for the emitted DWARF information for
extcall noalloc. This PR is currently under review.
Update opam file to 4.10.0+multicore
The rebasing of Multicore OCaml to 4.11 branch
parallel_minor_gc_4_11) point is now complete! The
file for 4.10.0+multicore has been made the default in the
Add byte_domain_state.tbl to install files
A patch to install
now been included in the runtime/Makefile which is required for
The Multicore OCaml major GC implementation verification using the
SPIN model checker is available at the following GitHub repository
Task API Port: LU-Decomposition, Floyd Warshall, Mandelbrot, N-body
Porting of the following programs - LU-Decomposition, Floyd
Warshall, Mandelbrot and N-body to use the Task API.
Make benchmark wrapper user configurable
The ability to dynamically specify the input commands and their
respective arguments to the benchmark scripts is currently being
Promote dune > 2.0
Sandmark works with dune 1.11.4 and we need to support dune greater
than 2.0 moving forward. The upgrade path with the necessary package
builds is being tested.
An interactive notebook to run and analyse sequential benchmarks has
been included. Given an artifacts directory with the benchmark
files, the notebook prompts you in the GUI to select different
commit and compiler variants for analysis. A sample screenshot of
the UI is shown below:
The PR adds error handling, user input validation and the project
README has also been updated.
Add parallel initialisation and parallel copy to LU decomposition benchmark
The parallel initialisation is now added to LU decomposition
Use --format=columns with pip3 list in Makefile
A fix for the “DEPRECATION: The default format will switch to
columns in the future” warning when using
pip3 listhas now been
added to the Makefile with the use of the
Use sudo for parallel benchmark builds
The Makefile has been updated with the right combination of
and OPAM environment variables so that we can now run parallel
benchmarks in Sandmark. The sudo command is required exclusively for
chrtcommand. We can now perform nightly builds for both
serial and parallel benchmarks!
Refactored README and added JupyterHub info
The Sandmark README file has now been updated to include information
on configuration, usage of JupyterHub, benchmarking and a quick
Add manual page for the instrumented runtime
A draft manual for the instrumented runtime eventlog tracing has
been created. Please feel free to review the document and share your
Support building executables against OCaml 4.11 instrumented runtime
OCaml 4.11.0 has built-in support for the instrumented runtime, and
it will be useful to have dune generate instrumented targets.
Eventlog tracing system
The Eventlog tracing proposal for the OCaml runtime that uses the
Binary Trace Format (CTF) is now merged with upstream OCaml
[RFC] Dynamic check for naked pointers
An RFC for adding the ability to dynamically identify naked pointers
in the 4.10.0 compiler.
Reimplement Unix.create_process and related functions without Unix.fork
The use of process creation functions in the Unix module is not
suitable for Multicore OCaml, for both behaviour and efficiency. The
patch provides an implementation that uses
Add a macro for out-of-heap block header
This PR adds a macro definition to construct a out-of-heap block
header in runtime/caml/mlvalues.h. The objective is to use the
header for out of heap objects.
As always, we would like to thank all the OCaml developers and users for their continued support and contribution to the project. Stay safe out there.
- API: Application Programming Interface
- CTF: Common Trace Format
- DWARF: Debugging With Attributed Record Formats
- GC: Garbage Collector
- GUI: Graphical User Interface
- LU: Lower-Upper
- OPAM: OCaml Package Manager
- PIP: Pip Installs Python
- PR: Pull Request
- RFC: Request for Comments
- UI: User Interface