OCaml compiler development newsletter, issue 6: March 2022 to September 2022

gasche · November 8, 2022, 9:31pm

I’m happy to publish the sixth issue of the “OCaml compiler development newsletter”. You can find all issues using the tag compiler-newsletter.

Note: the content of the newsletter is by no means exhaustive. Only a few of the compiler maintainers and contributors had the time to write something, which is perfectly fine.

Feel free of course to comment or ask questions!

If you have been working on the OCaml compiler and want to say something, please feel free to post in this thread! If you would like me to get in touch next time I prepare a newsletter issue (some random point in the future), please let me know by Discuss message or by email at (gabriel.scherer at gmail).

Context

The Multicore merge is behind us now. We are in the final preparation stages for 5.0 (but by no means at the end of the Multicore-related work; many things were left to do in 5.1 and further releases). Non-Multicore development has been restarting slowly but surely.

@yallop Jeremy Yallop

We’re starting up the modular macros work at Cambridge again, with the aim of adding support for typed, hygienic, compile-time computation to OCaml. Back in 2015 we presented our original design at the OCaml Users and Developers Workshop, and we subsequently went on to develop a prototype in a branch of the OCaml compiler. We’re planning to complete, formalise, and fully implement the design in the coming months.

@dra27 David Allsopp

Various bits of house-keeping on the compiler distribution have been managed for 5.0, taking advantage of the major version increment. All the compiler’s C symbols are now prefixed caml_, vastly reducing the risks of conflicts with other libraries (#10926 and #11336). The 5.x compiler now installs all of its additional libraries (Unix, Str, etc.) to separate directories, with a friendly warning that you need to specify -I +unix etc. which makes life slightly easier for build systems (#11198) and the compiler now also ships META files for all its libraries by default (#11007 and #11399). Various other bits of 5.0-related poking include a deprecation to allow the possibility in future for sub-commands to the ocaml command. Instead of ocaml script, one should now write ocaml ./script (#11253). The compiler’s bootstrap process (this is the mechanism which updates the initial compiler in boot/ocamlc which is used to build the compiler “from cold”) is now reproducible (#11149), easing the review process for pull requests which need to include changes to the boot compiler. Previously we required a core developer to re-do the bootstrap and then separately merge the work, where now a core developer can merely pull the branch and check that the committed artefact is reproducible.

Looking beyond the release of OCaml 5.0, I’ve also been working to resurrect the disabled Cygwin port of OCaml (#11642) and, more importantly, getting the MSVC native Windows port working again (that’s still WIP!).

At the OCaml Workshop this year, I demonstrated my “relocatable compiler” project, which aims both to eliminate various kinds of “papercut” when using bytecode executables and, much more importantly, to allow compiler installations to be copied to new locations and still work, which paves the way for faster creation of opam switches everywhere. It was great to be able to meet so many people in person in Slovenia for the first in-person workshop since 2019, but unfortunately that came at the cost of catching COVID, which has slowed me down in the weeks since! The next stage for the relocatable compiler is to have an opam remote which can be added to allow opt-in testing of it with OCaml 4.08-5.0, and then to start opening pull requests, hopefully for inclusion of the required changes in OCaml 5.1 or 5.2.

@sadiqj Sadiq Jaffer

The bulk of my upstream work over the last year in OCaml 5.0 has been on Runtime Events, a new tracing and metrics system that sits inside the OCaml runtime. The initial PR can be found at #10964 and there was a separate documentation PR in #11349. Lucas Pluvinage has followed up with PR #11474 which adds custom application events to Runtime Events and I hope isn’t too far off merging. We gave a talk at the OCaml Users and Developers workshop on Runtime Events and I’m hoping there will be a video up on that soon.

@garrigue Jacques Garrigue

We have continued our work on refactoring the type checker for clarity and abstraction.
An interesting result was PR #11027: separate typing of counter-examples from Typecore.type_pat. Namely, around 2015 type_pat was converted to CPS style to allow a more refined way to check the exhaustiveness of GADT pattern-matching. A few more changes made the code more and more complex, but last year in #10311 we could extract a lot of code as case-specific constraints. This in turn made possible separating type_pat into two functions: type_pat itself, only used to type patterns in the source program, which doesn’t need backtracking, and check_counter_example_pat, a much simpler function which works on counter examples generated by the exhaustiveness checker.
I have also added a -safer-matching flag for people who don’t want the correctness of compilation to depend on the subtle typing arguments involved in this analysis (#10834).
In another direction, we have reorganized the representation of type parameter variances, to make the lattice involved more explicit (#11018).
We have a few PRs waiting for merging: #11536 introduces some wrapper functions for level management in types, #11569 removes the encoding used to represent the path of hash-types associated with a class, as it was not used in any meaningful way.
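
To illustrate why the exhaustiveness of GADT pattern-matching depends on typing, here is a minimal sketch. The type ty and the function default_int are hypothetical names chosen for this example, not taken from the compiler sources. The match is exhaustive only because the missing case Bool cannot be typed at int ty, which is the kind of check performed on counter-examples.

```ocaml
(* Hypothetical GADT: each constructor fixes the type index. *)
type _ ty =
  | Int : int ty
  | Bool : bool ty

(* Only [Int] can have type [int ty]. The exhaustiveness checker proposes
   [Bool] as a counter-example; typing it at [int ty] fails, so the match
   is accepted as exhaustive without a warning. *)
let default_int : int ty -> int = function
  | Int -> 0

let () = Printf.printf "%d\n" (default_int Int)
```

A flag such as -safer-matching is aimed at users who would rather not have compilation rely on this typing argument.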

There are also a large number of bug fixes (#10738, #10823, #10959, #11109, #11340, #11648). The most interesting of them is #11648, which extends type_desc to allow keeping expansions inside types. This is needed to fix a number of bugs, including #9314, but the change is invasive, and reviewing may take a while.

Coqgen, the Coq backend (previously named ocaml_in_coq), is still progressing, with the addition of constructs such as loops, lazy values and exceptions, and we are trying to include GADTs in a comprehensive way.

@gasche Gabriel Scherer

@nojb Nicolás Ojeda Bär has done very nice work on using tail-modulo-cons for some List functions of the standard library, which I helped review along with Xavier Leroy:

#11362: List.map, List.mapi, List.map2
#11402: List.init, List.filter, List.filteri, List.filter_map

Some of those functions were hand-optimized to use non-tail-rec code on small inputs. Nicolás’ micro-benchmarks showed that the TMC-transformed version was often a bit slower on very small lists, up to 20% slower on lists of fewer than five elements. We wanted to preserve the performance of the existing code exactly, so we did some manual unrolling in places. (The code is a bit less readable than the obvious versions, but much more readable than what was there before.)
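
As a rough illustration of the technique (a sketch, not the actual standard library code), the following shows a tail-modulo-cons map and a manually unrolled variant. It assumes a compiler that supports the [@tail_mod_cons] attribute (OCaml 4.14 or later); the name map_unrolled is hypothetical.

```ocaml
(* Plain tail-modulo-cons version: the recursive call sits directly under
   a constructor, so the compiler can compile it without growing the stack. *)
let[@tail_mod_cons] rec map f = function
  | [] -> []
  | x :: xs ->
      let y = f x in
      y :: map f xs

(* Hypothetical manually unrolled variant: two elements per step, which
   reduces the per-element overhead on short lists. *)
let[@tail_mod_cons] rec map_unrolled f = function
  | [] -> []
  | [ a ] ->
      let a' = f a in
      [ a' ]
  | a :: b :: rest ->
      let a' = f a in
      let b' = f b in
      a' :: b' :: map_unrolled f rest

let () =
  let l = List.init 100_000 (fun i -> i) in
  Printf.printf "%d %d\n"
    (List.length (map succ l))
    (List.length (map_unrolled succ l))
```

The let-bindings before each cons keep the evaluation order of f explicit, at the cost of slightly more verbose code.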

I worked on fixing a 5.0 performance regression for bytecode-compiled programs (#11337). I started with the intuition that the overhead came from having a concurrent skip list in the 5.x runtime instead of a non-concurrent skip list in the 4.x runtime, and wrote tricky code to use a sequential skip list again. Soon I found out that the performance regression was due to something completely different and had to dive into the minor-root-scanning code.

When I started looking at the multicore runtime, I had no idea how to print a backtrace from a C segfault without using valgrind. I wrote some documentation on debugging in runtime/HACKING.adoc in the hope of helping other people.

I spent some time reading lambda/switch.ml, which compiles shallow-match-on-constructor-tags into conditionals and jump tables. The file contains some references to research papers from the 90s, but it was unclear to me how they connected to the implementation. After a nice discussion with Luc Maranget I could propose a documentation PR #11446 to explain this in the source code itself. Thanks to Vincent Laviron for the nice review – as always.

@gadmm Guillaume Munch-Maccagnoni

(written by @gasche)

Guillaume worked on updating the “GC timing hooks” and caml_scan_roots_hook of the OCaml runtime to be multicore-safe, and added a new hook caml_domain_terminated_hook. (#10965, #11209) We rely on runtime hooks in our experimental boxroot library, and updating hooks for 5.0 was necessary to have a correct 5.0 version of boxroots.

Also related to our boxroot experiments, Guillaume wanted an efficient way to check whether a domain thread was currently holding its runtime lock – it does not when executing long non-OCaml computations that do not access the OCaml runtime. Guillaume changed the Caml_state macro to provoke an error when accessing the domain state without holding the domain runtime lock – a programming mistake that could easily go unnoticed before in simple testing and crash in hard-to-debug ways on different inputs – and introduced a new Caml_state_opt macro that is NULL when the runtime lock is not held. (#11138, #11272, #11506).

Guillaume worked on quality treatment of asynchronous actions in the new Multicore runtime. (#10915, #11039, #11057, #11095, #11190). Asynchronous actions are callbacks triggered in an OCaml program by the environment (rather than by an explicit request from the programmer at the point where they happen). They include for example signal handlers, finalizers called by the GC, Statmemprof callbacks. Supporting them well requires tricky code, because the runtime must ensure that such actions are executed promptly, but in a context where running OCaml code is safe. (For example it is easy to have bugs where an asynchronous action raises an exception in the middle of a runtime function that is not exception-safe.) The 4.x runtime had a lot of asynchronous-action fixes between 4.10 and 4.14, but sadly many of these improvements were not backported in the Multicore branch (they required expert adaptation to a very different runtime codebase), and were thus lost in the Multicore merge. The present work tries to come back to a good state for 5.0 and 5.1 – some of the fixes were unfortunately not merged in time for 5.0. Statmemprof support is currently disabled for 5.x, and this work will also be useful for Statmemprof.
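
As a small illustration of what an asynchronous action looks like from the programmer's side, here is a sketch using GC finalisers from the standard Gc module. The names finalised and allocate_garbage are hypothetical. Nothing in the program calls the finalisers; the runtime decides when it is safe to run them.

```ocaml
(* Counts how many finalisers the runtime has run so far. *)
let finalised = ref 0

(* Allocate [n] short-lived heap blocks and attach a finaliser to each.
   [@inline never] helps ensure the blocks do not stay reachable from
   the caller. *)
let[@inline never] allocate_garbage n =
  for i = 1 to n do
    let block = ref i in
    Gc.finalise (fun _ -> incr finalised) block
  done

let () =
  allocate_garbage 1000;
  (* The program never invokes the callbacks itself: they are run by the
     runtime once the blocks are found to be unreachable. *)
  Gc.full_major ();
  Printf.printf "finalisers run so far: %d\n" !finalised
```

A finaliser that raised an exception here would do so at whatever point the runtime chose to run it, which is why the surrounding runtime code has to be exception-safe.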

nojb · November 9, 2022, 6:32am

Thanks @gasche for the nice summary!

gasche · November 9, 2022, 9:24am

You are welcome. I ended up mentioning your TMC work, but not the rest (I thought of writing a section for you because there is much other cool work to mention, but in the past the reception to me writing about other people’s work has been mixed.)

nojb · November 9, 2022, 11:01am

Personally, I would have been grateful if you had done so. Anyway, I should have summarized some of my changes, so it’s on me. I will try to do better next time. Thanks!

Cheers,
Nicolas

hyphenrf · November 10, 2022, 5:05am

Amazing, so many great things to look forward to! I thought you all would be exhausted after the multicore merge, but I’ve been proven wrong in the most exciting way.

gadmm · November 14, 2022, 2:53pm

Thanks for the summary @gasche. (@gasche asked me to write about what I did; it took me too long to do it, so he wrote something for me.) In addition to what @gasche described, many things I have worked on are not visible in the change log, notably bug fixing and cleanups for the new multicore runtime.

Systhreads

The runtime lock check had long been requested by some foreign-function interface (FFI) users. Its implementation also sent me down a rabbit hole of fixes to the new systhread implementation (#11271, #11385, #11386, #11390, #11473, #11403), with some fixes still pending. The multicore systhread implementation was new code with few experts, so it benefited from this much-needed attention.

Memory model

In a discussion in a conference call with OCaml developers shortly after the merge (to which I was invited thanks to @kayceesrk), I was asking whether the accesses to the OCaml heap from C (runtime, VM and FFI) should not be made atomic. At the time, the C parts of the runtime were accessing the OCaml heap through non-atomic loads and stores. (More generally, adapting the OCaml runtime and existing FFI code to multicore requires explaining first how to implement the OCaml memory model in the C memory model.) I do not remember the exact wording (and I would not be able to quote it given that the meeting was unfortunately non-public), but I remember that the answer did not convince me.

I later came back with concrete examples of undefined behaviour involving the Field macro (#10992). This put it back on the radar and enabled OCaml developers to start addressing this issue. Unfortunately, it appears that the question of the OCaml memory model for C code will still not be clarified as of 5.0.

The challenge is finding a good balance between backwards-compatibility, efficiency and standards-compliance. A first part was addressed by marking accesses via the Field macro volatile (#11255). The C volatile keyword does not imply atomicity according to the standard, but it is used in many legacy codebases in this way, so this is likely to work while remaining (mostly) backwards-compatible. (To my knowledge only Field which I gave as an example has been fixed. I suggested that all runtime, VM and FFI API should be reviewed in the new light, but I am not aware that developers had the time to do this yet. Similarly, user C code that is not using strictly the documented FFI and instead relies on knowledge of the OCaml value representation in order to access values, will have to be audited before any use inside multicore programs.)

Another issue still to be addressed is synchronizing the read of initial writes to values, as needed for memory safety (this problem does not appear in the PLDI2018 paper, because it does not accurately model initial writes). Even leaving backwards-compatibility for the FFI aside, using memory_order_acquire on mutable loads would be cheap on x86-64 but expensive on Arm. Instead, OCaml relies on some other order-preserving property that comes for free on Arm (address dependency). Now, C compilers do not understand this notion yet (McKenney 2017), let alone let OCaml offer a backwards-compatible and standards-compliant solution.

One solution which @kayceesrk and I proposed involves requiring users to change reads of mutable fields to using a different dedicated function or macro (doing an acquire atomic load), with some provisions that they can adapt their program to multicore gradually (e.g. legacy code still works due to absence of races/parallelism). (This is a milder situation in terms of backwards-compatibility than the one with the original concurrent multicore GC that led to its abandonment in favour of the stop-the-world design.)
Another path I proposed was to rely on de facto preservation of dependency ordering by current compilers (modulo a certain number of constraints and fingers-crossings), taking inspiration from the “Linux Kernel memory model” (which, as its name does not indicate, denotes the absence of one); and to subsequently work towards specifying what this means in terms of OCaml’s requirements. This would be the ideal approach, because it does not require legacy programs to be fixed. I had the chance to discuss this approach with @stedolan in Ljubljana in September, and without entering into technical details, one conclusion was that it was possibly interesting work, but very hypothetical. Moreover, other users of dependency ordering (e.g. Linux RCU) have different requirements, and are currently pushing propositions to evolve C/C++ that are going in directions that are not necessarily suited to OCaml’s needs on this issue.

The risk is that OCaml programs end up in a no-man’s-land in terms of C standards due to the volatile fix working well enough (so far; as far as we know; etc.), and this even in the event where a purely standards-compliant solution was later adopted by OCaml (but where users would lack motivation to adapt their code).

This issue has not received as much discussion as I had hoped; there was no feedback on the solutions. Unfortunately, this problem not being addressed for 5.0 means that OCaml developers might be envisioning breaking FFI changes later on, including changes that could cause API breakages in the multicore OCaml-Rust interface which I have been working on.

Lazy in multicore

I came up with a design for thread-safe Lazys in multicore that would allow various efficient (and custom) synchronization schemes (ocaml-multicore#750) and I started to implement a prototype. Lazy is currently thread-unsafe in multicore (although memory-safe); thread-safe lazys would make it possible to implement globals with delayed initialization optimally, one common and convenient form of synchronization in multithreaded languages (cf. “magic statics” in C++11, lazy statics in Rust), along with more forms of lazy that are useful in theory but that have not been tried in practice yet.
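
To make the idea of a global with delayed, thread-safe initialization concrete, here is a minimal sketch built on the standard Atomic module (OCaml 4.12 or later). It is not the design from ocaml-multicore#750: it is one simple spinning scheme, and the names state, make, force and global are hypothetical.

```ocaml
(* A hypothetical once-initialised cell. *)
type 'a state =
  | Thunk of (unit -> 'a) (* not yet forced *)
  | Forcing               (* some thread is running the thunk *)
  | Done of 'a            (* result available *)

type 'a t = 'a state Atomic.t

let make (f : unit -> 'a) : 'a t = Atomic.make (Thunk f)

let rec force (cell : 'a t) : 'a =
  match Atomic.get cell with
  | Done v -> v
  | Forcing -> force cell (* busy-wait until the winner publishes [Done] *)
  | Thunk f as seen ->
      (* Only the thread that wins the compare-and-set runs the thunk. *)
      if Atomic.compare_and_set cell seen Forcing then begin
        match f () with
        | v ->
            Atomic.set cell (Done v);
            v
        | exception e ->
            (* Assumption of this sketch: restore the thunk so that a later
               [force] retries. Stdlib [Lazy] behaves differently here. *)
            Atomic.set cell seen;
            raise e
      end
      else force cell

(* Usage: the initialiser runs at most once, on first use. *)
let init_count = ref 0

let global =
  make (fun () ->
      incr init_count;
      6 * 7)

let () = Printf.printf "%d %d\n" (force global) (force global)
```

Spinning is the crudest possible scheme, and forcing the cell from inside its own thunk would loop forever; a design that lets the synchronization scheme be customized would let callers choose blocking or other strategies instead.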

Unfortunately there is a tedious bootstrapping issue with my prototype, and I have not made progress since (help is welcome). Another lack of motivation comes from the fact that the lazy short-cutting has been removed in OCaml 4.14 for fear it was too expensive for GC marking. If not for short-cutting, it would be possible to implement lazy in a much simpler way directly in OCaml. So the runtime is currently in a weird state where 95% of the invasive work to have the short-cutting optimization is still there, but the optimization is not done (actually, only happens for lazys that are forced while still young). Luckily, a side-effect of my work presented at the OCaml workshop is to show that lazy short-cutting comes essentially for free during GC marking thanks to instruction-level parallelism on modern processors, so I expect the core OCaml developers to put it back in.

Pooled mode

The work on Boxroot was also the occasion to see whether the obscure “pooled” mode whereby OCaml frees memory for you (OCAMLRUNPARAM=c) was useful to us. I had a PR to fix bugs in this mode and to report other design issues, but nobody had reviewed it after a year so I closed the PR. My reasoning is that since the mode is unmaintained and its design is broken, we may as well advise FFI users to steer clear of this mode.

Other projects

@gasche’s original message mentioned my work on restoring some safety guarantees of asynchronous callbacks in OCaml multicore. It is in fact a spin-off of past work where I aimed to show that it is possible to mix certain systems programming requirements (correctly releasing resources and handling errors) with certain functional programming requirements (being able to interrupt computation asynchronously), given the right language design. (My work on memprof-limits was another part of this work.) AFAIU OCaml multicore was originally developed with the idea that it is not really possible to handle asynchronous interruptions nicely at a language-design level, which explains that they were given second-class status in the new runtime, despite many existing programs using them.

The work on Boxroot presented with @gasche at the ML workshop is meant to enable the development of safe interfaces between OCaml and Rust. I am working on such an interface for multicore OCaml to serve as a common building block for the few other projects in this area, by giving a reference implementation of the multicore memory-safety model. We also opened an RFC opening the possibility to integrate Boxroot with the OCaml runtime, but there are a few things to clarify before this can be considered.

At the OCaml workshop I presented an implementation in the runtime of a large page allocator with support for huge pages (THP) and efficient pointer classification (i.e. what you need to point to dynamically-allocated memory in the GC heap with sharing-like properties, à la Ancient and OCamlnet). It is part of longer-term goals, along with Boxroot, aiming to show how it is possible to mix tracing GC with the idea of linear allocation with re-use coming from linear logic (a.k.a. “functional but in place” in recent papers). (Such a hybrid allocation scheme could open new possibilities in programming for large data or low latency requirements in functional programming.) This is loosely inspired by old papers by Jean-Yves Girard on mixing linear, intuitionistic and classical reasoning inside the same deductive system.

All this is part of a longer-term investigation into the mixing of systems and functional programming centred on the notion of resource as a first-class value. (I have various ideas in this area ranging from hands-on to very theoretical; students in France or in Cambridge UK should feel free to get in touch with me if they are interested in this subject overall.)

reisenberg · November 14, 2022, 5:14pm

The Jane Street compilers team is hard at work on a number of fronts, though perhaps less visibly than in the past. Previously, we worked on a feature and got it merged upstream without releasing it internally. This meant that our colleagues would have to wait for a main trunk release of OCaml before getting the new feature, a turnaround time that was sometimes too slow. In order to decrease latency between feature conception and use, we have now changed to a new workflow where we develop and release features internally while we work to upstream.

Not only does this new workflow get features into the hands of our developers faster, it also allows us to improve the final product:

The upstreaming discussion can be informed by actual practice by the 700+ OCaml developers within Jane Street. We can, for example, examine developer adoption rates and performance impact with ease.
Because we can access the entire code base where a feature is deployed, we can correct design mistakes with low cost. For example, if a feature proves too confusing to use in practice, we can change its syntax. Indeed, Jane Street regularly makes broad updates to its code base, and this kind of change fits within our workflow. For main-trunk OCaml, this means that features are more fully tested than they could be otherwise.

Upstreaming – contributing back to an open-source project and working in a language that reaches beyond our walls – remains a core value for the compilers team. (Indeed, one of my explicit job responsibilities at Jane Street is to help facilitate this process.) In addition, all our compiler work is fully open-source.

That said, we have a number of features we’re excited to share progress on, listed below. Expect to see proper proposals for these (posted to the RFCs repo) in due course.

Local allocations, implemented by Stephen Dolan and Leo White. This adds support for stack-allocated blocks, enforcing memory safety by requiring that heap-allocated blocks never point to stack-allocated blocks (and stack-allocated blocks never point to shorter-lived stack-allocated blocks). Function arguments can be annotated as local_ to say that the function does not store that argument. For example,

val map : 'a list -> f:local_ ('a -> 'b) -> 'b list

says that map does not store the function it is passed anywhere in the output, allowing the function closure to be stack-allocated. Stack-allocated records, variants, etc., are also possible. See also Stephen Dolan’s talk at ML’22.

This is a large addition to OCaml’s type system, and we’re still learning about how to use it best. We are actively learning from its deployment within Jane Street to influence the final version of this feature for upstreaming.

include functor, implemented by Chris Casinghino. This syntactic extension allows a module to include the results of applying a functor to the prefix of the module already written. For example:

module type Indexed_collection = sig
  type 'a t
  val mapi : 'a t -> f:(int -> 'a -> 'b) -> 'b t
end

module Make_map(M : Indexed_collection) : sig
  val map : 'a M.t -> f:('a -> 'b) -> 'b M.t
end = struct
  let map t ~f = M.mapi t ~f:(fun _ -> f)
end

module List = struct
  type 'a t =
  | Nil
  | Cons of 'a * 'a t

  let mapi t ~f = ...

  include functor Make_map
end

let evens = List.(map (Cons (1, Cons (2, Cons (3, Nil)))) ~f:(fun x -> x*2))

Despite being “just” a syntactic convenience, this has already received wide uptake within Jane Street.
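
For comparison, here is a sketch of what the same example could look like in standard OCaml without include functor. The submodule name Prefix, the module name My_list (used to avoid shadowing Stdlib.List) and the body given to mapi are assumptions made for this illustration; the original snippet elides the body of mapi.

```ocaml
module type Indexed_collection = sig
  type 'a t
  val mapi : 'a t -> f:(int -> 'a -> 'b) -> 'b t
end

module Make_map (M : Indexed_collection) : sig
  val map : 'a M.t -> f:('a -> 'b) -> 'b M.t
end = struct
  let map t ~f = M.mapi t ~f:(fun _ -> f)
end

module My_list = struct
  (* Without [include functor], the prefix of the module has to be given
     a name so that it can be passed to the functor. *)
  module Prefix = struct
    type 'a t =
      | Nil
      | Cons of 'a * 'a t

    (* Hypothetical body for [mapi], written for this sketch. *)
    let mapi t ~f =
      let rec go i = function
        | Nil -> Nil
        | Cons (x, rest) ->
            let y = f i x in
            Cons (y, go (i + 1) rest)
      in
      go 0 t
  end

  include Prefix
  include Make_map (Prefix)
end

let evens = My_list.(map (Cons (1, Cons (2, Cons (3, Nil)))) ~f:(fun x -> x * 2))
```

The extra Prefix submodule and the two include lines are the boilerplate that the extension removes.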

Comprehensions, implemented by Antal Spector-Zabusky. This adds support for both list and array comprehensions, such as

let pythagorean_triples_up_to n =
  [ a,b,c
    for a = 1 to n
    for b = a to n
    for c = b to n
    when a * a + b * b = c * c ]

The feature works similarly for arrays, with [| ... |] syntax.
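
For readers without access to the extension, a rough equivalent of the list comprehension can be written in standard OCaml (4.10 or later, for List.concat_map). This is a sketch of one possible encoding, not a description of how the feature is implemented; the helper range is hypothetical.

```ocaml
(* Inclusive integer range [lo; lo+1; ...; hi], empty when hi < lo. *)
let range lo hi = List.init (max 0 (hi - lo + 1)) (fun i -> lo + i)

(* Each [for] clause becomes a [concat_map] (or [filter_map] for the
   innermost one), and the [when] clause becomes the filter condition. *)
let pythagorean_triples_up_to n =
  List.concat_map
    (fun a ->
      List.concat_map
        (fun b ->
          List.filter_map
            (fun c ->
              if (a * a) + (b * b) = c * c then Some (a, b, c) else None)
            (range b n))
        (range a n))
    (range 1 n)
```

With this encoding, pythagorean_triples_up_to 13 evaluates to [(3, 4, 5); (5, 12, 13); (6, 8, 10)].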

Immutable arrays, implemented by Antal Spector-Zabusky. This adds support for immutable arrays. These are just like normal arrays, but immutability ensures that the contents of the array do not change.
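
The guarantee can be approximated in standard OCaml with an abstract type that exposes no mutating operation. The following is only a sketch under that assumption: the module name Immutable_array and its interface are hypothetical and say nothing about the actual design of the feature.

```ocaml
module Immutable_array : sig
  type 'a t
  val of_array : 'a array -> 'a t
  val length : 'a t -> int
  val get : 'a t -> int -> 'a
  val map : ('a -> 'b) -> 'a t -> 'b t
  val to_list : 'a t -> 'a list
end = struct
  (* Represented as a regular array, but the signature hides this and
     offers no way to write to it. *)
  type 'a t = 'a array

  (* Copy on construction so later writes to the source are not visible. *)
  let of_array = Array.copy
  let length = Array.length
  let get = Array.get
  let map = Array.map
  let to_list = Array.to_list
end

let () =
  let source = [| 1; 2; 3 |] in
  let frozen = Immutable_array.of_array source in
  source.(0) <- 99;
  (* [frozen] still holds the original contents. *)
  Printf.printf "%d\n" (Immutable_array.get frozen 0)
```

Unlike a built-in feature, this wrapper pays for a copy at construction time and has no literal or pattern syntax.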

Unboxed types, with a very early implementation by Chris Casinghino and proposal by me (with the help of my colleagues). Stephen Dolan and I have given talks about the design.

Currently, all variables and fields must store values that are stored in a single machine word (e.g. 64 bits); furthermore, this word must either be a pointer to garbage-collected memory or have its bottom bit tagged to denote that the garbage collector should skip it. Unboxed types relax this restriction, allowing a single variable or field to hold structures smaller or larger than a word, or store a word that the garbage collector will know not to scan. In so doing, unboxed types effectively allow records and variants to be inlined into one another, enabling programmers to structure their compile-time abstractions differently than their run-time memory layout.

The core innovation is the notion of a layout, which classifies a type (much like a type classifies a value). A layout describes how a value should be stored and whether it can be examined by the garbage collector. By assigning layouts to abstract types and type variables, we can abstract over non-word layouts. Much more is in the proposal.

Polymorphic parameters, implemented and documented by Leo White. This feature allows function parameters to have polymorphic types. For example:

let select
  : 'b 'c. selector:('a. ('a * 'a) -> 'a) -> ('b * 'b) list -> ('c * 'c) list -> ('b * 'c) list
  = fun ~selector bbs ccs ->
    List.map2 (fun bb cc -> selector bb, selector cc) bbs ccs

The select function chooses either the first components or the second components of the input lists to comprise the output list. The selector function must be polymorphic, because the elements in the pairs in the input lists may have different types. (Note that 'b and 'c are universally quantified in the type of select, so we know that the type checker does not unify them.)
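
In standard OCaml, the usual workaround is to wrap the polymorphic function in a record with a polymorphic field. The sketch below shows that encoding for the same example; the record type selector and its field pick are hypothetical names introduced here.

```ocaml
(* The field [pick] is polymorphic: it must work at every type ['a]. *)
type selector = { pick : 'a. 'a * 'a -> 'a }

let select ~(selector : selector) (bbs : ('b * 'b) list) (ccs : ('c * 'c) list)
    : ('b * 'c) list =
  (* Each use of [selector.pick] is instantiated independently, so it can
     be applied at two different element types. *)
  List.map2 (fun bb cc -> (selector.pick bb, selector.pick cc)) bbs ccs

let firsts =
  select ~selector:{ pick = fst } [ (1, 2); (3, 4) ] [ ("a", "b"); ("c", "d") ]

let seconds =
  select ~selector:{ pick = snd } [ (1, 2); (3, 4) ] [ ("a", "b"); ("c", "d") ]
```

Here firsts is [(1, "a"); (3, "c")] and seconds is [(2, "b"); (4, "d")]. Polymorphic parameters remove the need for the wrapper record.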

gasche · November 15, 2022, 10:29am

My understanding of this issue is as follows:

None of the core developers is actively working on multicore lazy, so you would probably have to contribute yourself if you want to improve them. (I have on occasion tried to help you along, and would be happy to do it again, but right now I would rather focus on boxroot.)
The reason why shortcutting was disabled is the interaction with GC prefetching which, as you explain, is currently missing from Multicore. Prefetching is important for performance in 4.x, and Jane Street would like to see it back in 5.x as well. Someone at Tarides is working on this (I don’t know who, KC would). So it sounds likely that prefetching will make a comeback for 5.x, and one would have to adapt the implementation again to work with lazy shortcutting.

You posted your patch to the prefetching code to re-enable lazy shortcutting in https://github.com/ocaml-multicore/ocaml-multicore/issues/750#issuecomment-986847502, with preliminary benchmarks suggesting that it does not decrease the performance of marking. On the same issue, @lpw25 wondered whether we really need shortcutting (what lazy-using programs actually depend on it for performance), suggesting that it is not a high-priority issue.

One difficulty that I foresee is that there is no one strongly pushing for lazy shortcutting to come back (this somewhat supports Leo’s point), and at the same time it is hard to convince the Jane Street people that the prefetching change you suggest is innocuous for their workflows (because only they can run the benchmarks they care about, and apparently they are not motivated enough by this issue to run them with your patch).

kayceesrk · November 15, 2022, 4:51pm

Fabrice Buoro at Tarides is working on this.

xavierleroy · November 15, 2022, 6:25pm

I’m not sure how and why private experiments at Jane Street or by @gadmm end up in this summary of what’s going on in the core OCaml system. But, for the record, I care about performance of lazy evaluation in OCaml, I believe that the shortcutting optimization is important for this, and I’d like to see it reimplemented in OCaml 5 at some future point.

gasche · November 15, 2022, 8:04pm

In inviting people to contribute to the newsletter if they wish, I use a broad interpretation of “compiler development” that includes non-upstreamed experiments, in particular Jane Street or Guillaume’s experiments (or others). I think that it is nice to learn about what other people are cooking, it can give us advance notice of discussions of interest, and let people realize that they are interested in related things and consider joining forces.

Yaron_Minsky · November 17, 2022, 2:04am

I appreciate that approach! And we’re happy to communicate our progress in whatever medium works best for the community.

In any case, I think we’ve erred in the past in under-communicating, and we’re trying to do better on that front, and Richard’s post here is meant to be a step in that direction.

It’s maybe worth reiterating the point that Richard made, which is that, while these are all currently patches against our internal branch of OCaml, we’re committed to doing the work to get them into a state that upstream will be excited to accept them. We have no interest in having a long-term fork of the language, and we’re aiming for language features that fit naturally into the language, rather than special-purpose hacks for our specific use-cases.

y

sid · November 17, 2022, 6:38am

It was a pleasant surprise to see @reisenberg’s post in discuss.ocaml.org! I feel happy that a prominent Haskell compiler contributor has joined us. Welcome to the OCaml community!

Thanks for giving an insight into what the Jane Street compilers team is working on for its customised version of the OCaml compiler. I would naturally expect Jane Street to use more and more of these custom features in its own codebase going forward.

But because of the way development generally works many of these custom features may never find their way into the main OCaml trunk (e.g. wider OCaml team disagrees on the inclusion of the feature for any number of reasons) while some of them might arrive after a significant delay. The delay is likely to be significant on the order of ~years as consensus is built and implementations possibly modified/peer reviewed in the interim.

While the intention and desire may be to always upstream the reality is that there might be an increasing divergence in the compiler codebases.

In the meantime, the OCaml community depends a lot upon Jane Street libraries. How would the newer versions of the libraries that progressively use more and more custom language features be made available to the OCaml community?

I would suppose you would need to strip out all the custom features so that it can run on plain vanilla OCaml or use something like cppo ? Will that always be practical though?

TL;DR How will this enhanced push towards using a custom OCaml compiler impact Jane Street’s future releases of open source OCaml libraries that currently only need a vanilla OCaml distribution?

Slightly Offtopic

I asked @snowleopard (another notable Haskeller who joined the OCaml community) a year ago:

It would be interesting to get your experience on the transition into OCaml. What do you find good/delightful in OCaml generally speaking (and in comparison with Haskell) and what would you like to change in OCaml?

What followed was an interesting response – See [Job] Build System Engineer at Jane Street - #3 by snowleopard

It would be great to get your insights too, Richard Eisenberg, should you feel so inclined to answer !

Yaron_Minsky · November 17, 2022, 11:06am

This is an interesting question, but really off-topic for this thread. I’ll answer in a new thread.
