I remember long ago, there was an idea of building a “multi-VM” version of OCaml, where you would have multiple complete OCaml abstract machines in a single address-space. So: separate heaps, separate sets of threads, etc. And then using message-passing a la Erlang for communication. And each VM would be like a current GIL OCaml VM (so pre-multicore).
I wonder what happened with that? It seemed to me like an elegant solution to incorporating SMP or NUMA parallelism, while not incurring all the complexity of true multithreaded runtimes and GC.
In short, taking a page from Erlang, and also from MPI.
What would be missing today? You could do most of that with a custom build:
1. Define your own `main()` to initialize your message-passing library.
2. Compile your OCaml code with `ocamlopt -output-complete-obj` (aka `(modes (native object))` in Dune) to get an object file `xyz.o` containing the runtime.
3. Use `objcopy --redefine-syms` on `xyz.o` to create `xyz1.o`, `xyz2.o`, etc. For each copy, redefine the caml symbols (e.g. `_Caml_state` to `_Caml1_state`, `_caml_alloc` to `_caml1_alloc`, etc.).
4. Spawn `caml_startup_1()`, `caml_startup_2()`, etc. in threads from `main()`.
5. Use normal FFI statements, e.g. `external message_pass_put : int -> string -> unit = "my_message_passing_function"`, to communicate from the many OCaml runtimes to the C message-passing library.
6. Link all the `xyzN.o`, your `main()` function, and probably a message handler to shut down `main()`, into a single executable.
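The `objcopy` step is the unusual part, so here is a minimal sketch of the renaming on a toy object file (assuming a Linux box with binutils and a C compiler; `demo.c` and its symbols are made up for illustration — a real build would generate one rename line per `caml_*`/`Caml_*` symbol extracted from the runtime object with `nm`):

```shell
# Toy stand-in for xyz.o: a tiny C object with two "caml" symbols.
cat > demo.c <<'EOF'
int caml_demo_counter = 0;
int caml_demo_get(void) { return caml_demo_counter; }
EOF
cc -c demo.c -o demo.o

# Symbol-rename file for runtime copy #1 ("old new" pairs, one per line).
cat > syms1.txt <<'EOF'
caml_demo_counter caml1_demo_counter
caml_demo_get caml1_demo_get
EOF

# Produce demo1.o with the renamed symbols; demo.o is left untouched,
# so the same input can yield demo2.o, demo3.o, etc. with other files.
objcopy --redefine-syms=syms1.txt demo.o demo1.o
nm demo1.o
```

The renamed copies no longer collide at link time, which is what lets several runtimes coexist in one executable.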
Having said that, I don’t know why anyone would do that unless they were embedding OCaml inside of a bigger C program that was already using some message passing framework. I’m curious if that (or something similar) is a real use case.
So the, uh, motivation (IIRC) was to reproduce the Erlang model in OCaml. There, Erlang “processes” (== “threads”) are significantly lighter-weight than UNIX processes.
Yes, this is a real use case. The classical way to do parallelism in OCaml was by using multiple processes and message passing. If one could run multiple independent runtimes in a single process I think it would make this design more lightweight.
For large codebases that use a lot of global state, it can be essentially impossible to rewrite them to make use of OCaml 5-style parallelism. So the above (multiple runtimes running in a single process) would be a nice intermediate step.
Thank you! I think that would be useful for developing PostgreSQL extensions with OCaml. I should try that approach, because right now it’s not possible to have several such extensions loaded into the same PostgreSQL process at the same time.
There used to be netmulticore (which was experimental AFAIU). It used an external heap shared between several OCaml processes (so they shared part of the address space), with something more elaborate than message-passing in mind, depending on what you count as message-passing. It comes with its own set of challenges, but I have the impression that this was not perceived as giving rise to worthy questions at a scientific level.
I am broadly interested in this, for various reasons. For instance, for low-latency applications you cannot mix a low-latency domain with a high-latency domain in the OCaml stop-the-world design (every domain must reach a safepoint for a minor collection, so one slow domain stalls them all), so an abstraction above that of domains makes sense to me.
ocamlnet and its netmulticore module is something that comes to mind obviously.
I have used it (it was the backend in parany for some time), and I think it was in production at some companies where Martin Jambon worked in the past.
So, I think it was not so experimental but rather “production ready”.
I think that the design idea of having a special Gc for things which are shared between processes and having those things clearly marked was not such a bad idea.
Thanks, I did not know that netmulticore was used in production (though it did look pretty elaborate for a mere prototype). It is even worse than I thought.
As far as I know, Erlang’s VM is full of very specific design choices that will apply to no other runtime in existence, and definitely not to OCaml’s.
In no particular order:
- the only shared values are (immutable, refcounted) large blobs
- all other values are deep-copied when sent from one process to another (possibly living on a remote machine). The data model of Erlang means that all values are serializable in ETF, a bespoke but specified binary format.
- processes totally own their heap (I don’t think it’s generational, but it can be quite tiny anyway). There is basically no sharing besides large blobs.
- less relevant, but processes are fully preemptable, so they’ll never be stuck in an infinite loop.
Anyway, I don’t think looking at Erlang for inspiration for OCaml is super useful. It’s very interesting, but ML gives the user a lot more power, including mutation.