Whatever happened to the idea of multi-VM OCaml runtimes?

I remember long ago, there was an idea of building a “multi-VM” version of OCaml, where you would have multiple complete OCaml abstract machines in a single address space. So: separate heaps, separate sets of threads, etc. And then using message-passing à la Erlang for communication. And each VM would be like a current GIL-style OCaml VM (so pre-multicore).

I wonder what happened with that? It seemed to me like an elegant solution to incorporating SMP or NUMA parallelism, while not incurring all the complexity of true multithreaded runtimes and GC.

In short, taking a page from Erlang, and also from MPI.

Maybe it just died b/c nobody cared?


What would be missing if you did most of that with a custom build? Something like:

  1. Define your own main() to initialize your message-passing library
  2. Compile your OCaml code with ocamlopt -output-complete-obj (a.k.a. (modes (native object)) in Dune) to get an object file xyz.o containing the runtime
  3. Use objcopy --redefine-syms on xyz.o to create xyz1.o, xyz2.o, etc. For each copy, redefine the caml symbols (e.g. _Caml_state to _Caml1_state, _caml_alloc to _caml1_alloc, etc.)
  4. Remove duplicate symbols, if any, from xyz2.o, xyz3.o, etc. with objcopy --strip-symbols
  5. Spawn the renamed caml1_startup(), caml2_startup(), etc. in threads from main()
  6. Use normal FFI declarations, e.g. external message_pass_put : int -> string -> unit = "my_message_passing_function", to communicate from the many OCaml runtimes to the C message-passing library
  7. Link all the xyzN.o files, your main() function, and probably a message handler to shut down main(), into a single executable
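As a rough sketch, the object-copying and symbol-renaming steps above might look like the following. The module name xyz, the hard-coded rename pairs, and main.c are placeholders; in practice you would generate the full rename list from the output of nm --defined-only xyz.o rather than writing it by hand:

```shell
# Hypothetical module "xyz"; build one complete object file, runtime included.
ocamlopt -output-complete-obj -o xyz.o xyz.ml

# Rename-pair file for the second runtime copy (one "old new" pair per line).
# Only a few illustrative symbols shown; a real list would cover every
# caml*/Caml* symbol defined in xyz.o (generate it from `nm --defined-only`).
cat > renames1.txt <<'EOF'
_Caml_state _Caml1_state
_caml_alloc _caml1_alloc
_caml_startup _caml1_startup
EOF

# Clone the object file with the renamed symbols so the two runtime
# instances don't collide at link time.
objcopy --redefine-syms=renames1.txt xyz.o xyz1.o

# Link a main.c (which spawns caml_startup() and caml1_startup() in
# separate threads) together with both copies into a single executable.
cc -o multi_vm main.c xyz.o xyz1.o -lm -lpthread
```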

Having said that, I don’t know why anyone would do that unless they were embedding OCaml inside of a bigger C program that was already using some message passing framework. I’m curious if that (or something similar) is a real use case.


So the, uh, motivation (IIRC) was to reproduce the Erlang model in OCaml. There, Erlang “processes” (== “threads”) are significantly lighter-weight than UNIX processes.


Some of the changes made for multicore brought this closer, but much work still remains. See https://github.com/ocaml/ocaml/pull/8713#issuecomment-498908086 for some general commentary.

I know some people care, but I don’t think anyone is actively working on or pushing for this.

Cheers,
Nicolas

Yes, this is a real use case. The classical way to do parallelism in OCaml was by using multiple processes and message passing. If one could run multiple independent runtimes in a single process I think it would make this design more lightweight.

For large codebases that use a lot of global state, it can be essentially impossible to rewrite them to make use of OCaml 5-style parallelism. So the above (multiple runtimes running in a single process) would be a nice intermediate step.

Cheers,
Nicolas


Plus, rewriting your message-passing program to ocaml-5-style parallelism might be disappointing in terms of parallelization performance.


Thank you! I think that would be useful for developing PostgreSQL extensions in OCaml. I should try that approach, because right now it’s not possible to have several such extensions loaded into a PostgreSQL process at the same time.


There used to be netmulticore (which was experimental, AFAIU). It used an external heap shared between several OCaml processes (so they shared part of the address space, with something more elaborate than message passing in mind, depending on what you count as message passing). It comes with its own set of challenges, but I have the impression that this was not perceived as giving rise to worthwhile questions at a scientific level.

I am broadly interested in this, for various reasons. For instance, for low-latency applications you cannot mix a low-latency domain with a high-latency domain in the OCaml stop-the-world design, so an abstraction above that of domains makes sense to me.

ocamlnet and its netmulticore module are what obviously come to mind.
I have used it (it was the backend of parany for some time), and I think it was in production at some companies where Martin Jambon worked in the past.
So I think it was not so experimental, but rather “production ready”.
I think the design idea of having a special GC for things shared between processes, with those things clearly marked, was not such a bad idea.

Thanks, I did not know that netmulticore was used in production (though it did look pretty elaborate for a mere prototype). It is even worse than I thought.

IIRC part of the Erlang “special sauce” was such a GCed heap for shared immutable strings. But I could be mistaken.

As far as I know, Erlang’s VM is full of very specific design choices that will apply to no other runtime in existence, and definitely not to OCaml’s.

In no particular order:

  • the only shared values are (immutable, refcounted) large blobs
  • all other values are deep-copied when sent from one process to another (possibly living on a remote machine); Erlang’s data model means that all values are serializable in ETF, a bespoke but specified binary format
  • processes fully own their heap (I don’t think it’s generational, but it can be quite tiny anyway); there is basically no sharing besides large blobs
  • less relevant, but processes are fully preemptible, so they’ll never be stuck in an infinite loop

Anyway, I don’t think looking at Erlang for inspiration for OCaml is super useful. It’s very interesting, but ML gives the user a lot more power, including mutation.

Looking at Erlang is definitely interesting for OCaml (for instance, for its error model).

I actually experimented with this, but as mentioned it is a lot of work.

It would be really useful to get that kind of defensiveness without spawning another process, and maybe even to share some immutable data.