OCaml compiler development newsletter, issue 5: November 2021 to February 2022

I’m happy to publish the fifth issue of the “OCaml compiler development newsletter”. You can find all issues using the tag compiler-newsletter.

Note: the content of the newsletter is by no means exhaustive, only a few of the compiler maintainers and contributor had the time to write something, which is perfectly fine.

Feel free of course to comment or ask questions!

If you have been working on the OCaml compiler and want to say something, please feel free to post in this thread! If you would like me to get in touch next time I prepare a newsletter issue (some random point in the future), please let me know by Discuss message or by email at (gabriel.scherer at gmail).

Context

The last issue (October 2021) corresponded to the last development period before merging the Multicore OCaml implementation and the “Sequential freeze” (a freeze on non-multicore-related changes to facilitate the Multicore merge).

Since then there has of course been a massive amount of work by the Multicore team (see the massive Multicore newsletter for January 2022). The upstream development pace has been unusual: there is less non-multicore activity than before (last-minute changes for the 4.14 release, and long-running projects moving on in parallel), there is a fair amount of work cleaning up things that were broken by the Multicore merge, with an influx of new contributors and also some non-new contributors which are still complete beginners with respect to the Multicore runtime.

Things are moving along at a reasonable pace, and we expect to release 5.0 at some point :slight_smile:

Individual reports

@gasche Gabriel Scherer

Shapes

In #10718, @voodoos Ulysse Gérard, @trefis Thomas Refis and @lpw25 Leo White proposed “shapes”, a new form of static information about OCaml modules that would be computed and stored by the OCaml compiler to help other tools, in particular Merlin, work with OCaml names/definitions. (This is entirely their work, not mine!) The work was merged in the 4.14 development version.

After the merge, opam-wide testing by @kit-ty-kate Kate Deplaix revealed performance issues on some functor-heavy code, in particular Irmin. Shape computation in the compiler would blow up, in computation time and/or size of the generated build artifacts.

The core mechanism of “shapes” is an evaluator for lambda-terms. In #10825 I worked on a more efficient evaluator (with help from @Edkohibs Nathanaëlle Courant), using relatively advanced machinery (strong call-by-need reduction), and it gives good results in practice – there are no known cases of time or size blowup anymore. There was a lot of back-and-forth between different design and implementation choices, and additional testing by Ulysse helped a lot! Finally we had an in-person review meeting with Ulysse, Thomas, @Octachron Florian Angeletti and Nathanaëlle, and it was a lot of fun, especially in these times of low in-person activity.

GADT and pattern generalization

In #1097, @dongyan reported a soundness bug in the OCaml type-checker due to the interaction of polymorhpism (generalisation) and GADT existential types in the type inference of pattern-matching. We analyzed the issue and I proposed a restriction of the typing rule to reject unsound examples; @lpw25 Leo White refined the proposal further. I tried to implement the fix/restriction myself, but it’s not easy when one is not familiar with the OCaml codebase. Jacques Garrigue proposed a full implementation in #10907, which is now merged. (This implements only my initial criterion, not Leo’s refinement, which is harder to implement within the current type inference implementation for patterns.)

Beginner-level Multicore hacking

These days, OCaml maintainers are gently encouraged into working on the Multicore-related post-merge tasks, instead of slacking off working on cool optimisations or type system bugs. (The two tasks above have the excuse that they helped preparing the 4.14 release.) But most people knew nothing about the Multicore runtime until a few months ago, so everyone is a beginner here!

I worked on small refactoring or minor bugfixes as I spotted them reading the code, with three larger pieces:

  1. #10887: I kept working with @xavierleroy on the Domain.DLS interface to let OCaml libraries store per-domain global state; now it’s possible to create per-domain state that is “inherited” on Domain.spawn (the child state is computed from the parent state). This was a necessary building block to replace the Random implementation by a splittable random number generator, which was finally done in #10742 by Xavier using LXM as planned.

  2. #10971: this issue by @sabine Sabine Schmaltz is discussing how to change the size of the reserved memory address space used for minor heaps, when the user asks for larger minor heaps or for more domains. (Each domain has its own minor heap, but they are contiguous for fast Is_young checking.) Sabine and I are working on an implementation. I sent various preparation PRs such as #10974 (changing the mechanism to compute unique domain identifiers, which relied on a fixed Max_domains limit).

  3. gc stats (#11008, #11047): the code to compute GC statistics needs some love, it changed significantly in the Multicore runtime but is also trying to preserve the interface exposed by the previous GC, and some things are slightly wrong. I started reading the code from the Max_domains angle (it is storing per-domain statistics in a fixed-sized array), but ended up working on it with help from @Engil Enguerrand Decorne.

@xavierleroy Xavier Leroy

On January 10th 2022, I had the privilege to push the “merge” button on pull request #10831, thus bringing Multicore OCaml into the OCaml “trunk” and giving birth to OCaml 5.

Before and after this glorious moment, the Multicore OCaml development team, the other reviewers and I have been spending considerable amounts of time studying and reviewing the Multicore OCaml sources, reviewing the big pull request, and fixing the issues that remain after the merge. I also spent much time reworking our Jenkins CI system to handle OCaml 5 and adapting the test suite accordingly.

The transition to OCaml 5 is also a great opportunity to remove long-obsolete features and simplify the code base. For example, in #10898, I was able to remove a lot of cruft for signal handling that is no longer relevant now that signal handlers no longer run immediately on receiving the signal and stack overflow is explicitly managed by ocamlopt-generated code. Another example is #10935, which deprecates the Thread.exit function and offers a simpler, exception-based mechanism for early thread termination.

Last but not least, I was finally able to merge my reimplementation of the Random module on top of the lovely LXM pseudo-random number generator (#10742), thanks to @gasche’s work on domain-local state.

@Octachron Florian Angeletti

Benchmarking compilation time and compilation artefact size for shapes

As described by @gasche, OCaml 4.14 introduces a new kind of metadata called shape. The computation of shapes metadata has some cost in term of compilation time. During the initial review, I had completely underestimated that cost, which lead to the compilation time blow up reported by irmin. When it was time to fix that mistake, I wanted to have a more global view of the effect of the shape computation on a significant slice of the opam repository.
Thus I measured both the compilation time and the size of compilation artefacts with and without the shape computation for over a thousand of opam packages (the half of the opam repository with the most reverse dependencies).
Fortunately, this performance measurement campaign concluded that for 90% of source files, the increase of compilation time was less than 10%:

percentile relative compilation time increase
1% -9%
10% 0%
25% 0%
50% 0%
75% +4%
90% +10%
99% +20%
99.9% +32%
99.99% +46%

Documentation tags for a new era

With Multicore OCaml, OCaml libraries will need to document how safe they are to use in a multicore context, in particular if they use some global state. To make it as easy as possible to document that point, I have started to implement new ocamldoc tags for multicore safety in #10983 . However, the developers of odoc weren’t keen on the idea of adding more tags, and proposed that this information should be conveyed in attributes that could be lifted to the documentation. @julow Jules Aguillon proposed an implementation in #11024, that I have reviewed. This change by itself improves the state of alert in the documentation which is already a great improvement and should allow us to reach better design for multicore-safety documentation later on.

Beginner-level multicore hacking

To participate a little to the stabilisation of OCaml 5.0, I spend some time to make the Dynlink library thread-safe in the easiest way: by a adding a global lock to the library. This is one of the few cases where a adding global lock to an existing library makes sense since we simultaneously don’t expect call to Dynlink function to be performance critical and really don’t want race condition in Dynlink to corrupt the state of the whole program.

@garrigue Jacques Garrigue

Cleaning up variance computation

For many, the computation of variances in type definitions is quite mysterious. The only available specification is a short abstract at the OCaml Meeting 2013, and it is actually incomplete. With Takafumi Saikawa, I took a fresh look at the problem and we came up with a clearer lattice, and more explicit algorithms. This is still a draft PR #11018.

Separate typing of counter-examples from type_pat

Since the introduction of GADTs, type_pat, the function that types patterns, is also used to discard impossible cases during the exhaustiveness check. Moreover, it was later turned into continuation-passing style to allow backtracking, in order to check type inhabitation. Originally this seemed a good idea, allowing to factorize much code, but then new functionality was added to type_pat, and the two roles started to diverge. #11027 is another draft PR that separates them, reverting type_pat to direct style, and adding a new retype_pat function which takes as input a partially typed tree. Interestingly, this de-factorization actually reduces the code size by more than a hundred lines.

Coqgen

The experimental Gallina generating backend is still progressing albeit slowly.
For those intrested there are now slides describing it.

@nojb Nicolas Ojeda Bär

We are taking advantage of the 5.0 release to get rid of a lot of cruft that had built up over time:

  • #10897: Remove everything officially marked as deprecated before the 5.0 release.
  • #10863: Remove the <caml/compatibiliy.h> header which contained the definition of some runtime symbols without the caml_ prefix for compatibility.
  • #10896: Remove Stream, Genlex and Pervasives from the standard library, as well as the standalone bigarray library (no longer needed since the Bigarray module was moved into the standard library).
  • #11002: No longer use the Begin_roots/End_roots macros in the runtime system. These were not yet removed because they are used by some external projects (eg camlidl).
  • #10922, #10923: Add deprecation attributes to some symbols that did not have them so that we can remove them in some future release.

Apart from all this spring cleaning, a small addition to the standard library that I had missed for a long time:

  • #10986: Add Scanf.scanf_opt, Scanf.sscanf_opt and Scanf.bscanf_opt option-returning variants.

@dra27 David Allsopp

mingw-w64 port for native Windows OCaml 5

I did the original port of multicore OCaml way back in 2018 (see Discuss post); that got rebased to 4.10 and updated to include native code support during summer 2020, but the testing story wasn’t quite there. However, it got updated in the autumn and was merged just-in-time for the main PR to ocaml/ocaml! At the moment, we only support the mingw-w64 port on Windows: OCaml 4.x has a hand-crafted implementation of all the required pthreads primitives in the systhreads library, but for 5.x we’re, at least for now, using the winpthreads library from the mingw-w64 project.

In the meantime, the Cygwin64 port is mostly broken for slightly complicated reasons and the MSVC port isn’t working from a combination of a lack of C11 atomics support and the aforementioned winpthreads library which is in mingw-w64. It’s extremely unlikely that the MSVC port will be ready for OCaml 5.0, but I have got a version of it just about working using C++ atomics and a manual build of the winpthreads library using Visual Studio, so there is light at the end of the tunnel for the MSVC port, hopefully for OCaml 5.1!

FlexDLL updates for Visual Studio

Recent changes in the Windows SDK required some alterations in the flexlink linker, used in both the native Windows ports and the Cygwin port of OCaml. This project got a little bit of merging attention and a release, so recent updates of Visual Studio 2017 and 2019 and also Visual Studio 2022 now successfully build OCaml 4.x, even with the Windows 11 SDK.

Native toplevel library in 4.14

I helped to upstream some work done by others, and also narrowed the differences between ocaml and ocamlnat for 4.14. In OCaml 4.14, the native toplevel library is now always installed, which paves the way for native code programs which wish to interpret toplevel statements (e.g. a native version of the mdx tool). Part of this work also inserted some hooks into the toplevel which allow the external linker to be replaced with a dynamic code emitter (so-called “ocaml-jit”, but the use of JIT seems to cause a lot of confusion - the point is that the overhead of repeatedly calling an external assembler is removed, which is measurable when compiling lots of individual phrases).

glibc 2.34 back-ports

glibc 2.34 includes a change to the way alternate stacks for the signal handlers must be allocated. The issue was partially fixed in 4.13, but as newer distributions started to ship with this newer glibc, the change in behaviour for older OCaml versions was becoming problematic - for example, you couldn’t install OCaml 4.12 or earlier on the latest release of Ubuntu or Fedora. Xavier worked on the second half of the fix already in 4.13 (to deallocate the alternate stack on termination), and I managed the process of back-porting it to all the versions of OCaml in opam-repository (3.07+!). We also took the decision to push these patches to the old release branches on GitHub, partly as a convenient place to store them, since opam-repository references them from there, and partly as it allows the older branches still to be compiled directly from a git clone.

24 Likes