Multicore OCaml vs Threads

I’m a bit ashamed of asking this question, but what’s the difference between Multicore OCaml and Threads?

I partially know the answer: Multicore is “true” concurrency, and the GC/memory model is adapted to that. OCaml cannot do true concurrency across several cores because of a global lock in the runtime.
Multicore OCaml also comes with algebraic effects, which allow safer concurrent programming.

But then, what does Threads actually do?

In the following program:

let () =
  let x = ref 0 in
  let do_ () = x := !x + 1; Format.printf "%i\n@?" !x in
  do_ ();
  let t1 = Thread.create do_ () in
  let t2 = Thread.create do_ () in
  Thread.join t1;
  Thread.join t2

what actually happens?

The root of my question: right now I’m working on a project that uses Unix.fork all over the place to compute things faster. The issue is that it’s not possible to join the results at the end; the only option is to print each result to a file or to stdout.
If I used Threads instead, would it be slower because they cannot actually run in parallel?

8 Likes

Two thoughts: (1) yes, if your program actually needs more cores for OCaml code, it’ll run slower with the current runtime design.
(2) Is your problem amenable to an explicit-parallelism solution? E.g. MPI, or perhaps @XVilka (I think it was) has libraries for explicit parallelism (which manage all the copying, process creation, etc.).

Maybe you can use hack_parallel?

See https://github.com/rvantonder/hack-parallel-example

You are probably mistaken. I didn’t publish any parallel execution libraries. For now I just manage everything manually in the code with the help of Lwt_pool and Lwt_preemptive. For dealing with maps there is also the very simple and minimalistic parmap library.
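For what it’s worth, here is a minimal sketch of what a parmap-based parallel map can look like, assuming the usual Parmap.parmap entry point with the Parmap.L list wrapper (adjust ~ncores to your machine):

(* Sketch: square a list of integers across 4 worker processes with parmap. *)
let () =
  let inputs = List.init 100 (fun i -> i) in
  let results = Parmap.parmap ~ncores:4 (fun x -> x * x) (Parmap.L inputs) in
  List.iter (fun r -> Printf.printf "%d\n" r) results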

Hm … I -do- think it was this parmap package, but heck, I’m gettin’ old, memory’s fadin’, no refresh pulse is comin’ anytime soon. Ah, well. This parmap package looks familiar though.

There’s also mpifold. It is roughly a parallel map built on top of MPI.

Keep in mind that I wrote mpifold while learning OCaml.

IIRC Jean-Christophe Filliâtre and one of his students wrote a map/reduce implementation in OCaml. I don’t know where it’s stashed, though …

But then, what does Threads actually do?

Well, as I understand it, you get a kind of parallel semantics with Thread.{select,delay,yield,...}, so you can have seemingly parallel computations, even though in practice only one thread runs OCaml code at a time. Python also has such threads despite having a GIL; I think it’s for the same purpose.

Second, you can actually release the runtime lock from C code, so you can run heavy or blocking computations in parallel, provided those computations happen in a library called through the FFI.
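As an illustration of the first point (this is a sketch I’m adding, not code from the thread): splitting CPU-bound OCaml work across two threads does not make it finish faster, because only one thread holds the runtime lock at a time.

(* Sketch: the same CPU-bound work, done sequentially and then split across
   two OCaml threads.  Expect roughly the same total time in both cases.
   Build with something like:
     ocamlfind ocamlopt -package unix,threads.posix -linkpkg demo.ml -o demo *)
let busy n =
  let acc = ref 0 in
  for i = 1 to n do acc := !acc + i done;
  !acc

let time label f =
  let t0 = Unix.gettimeofday () in
  f ();
  Printf.printf "%s: %.2fs\n%!" label (Unix.gettimeofday () -. t0)

let () =
  let n = 200_000_000 in
  time "sequential" (fun () -> ignore (busy n); ignore (busy n));
  time "two threads" (fun () ->
    let t1 = Thread.create busy n in
    let t2 = Thread.create busy n in
    Thread.join t1;
    Thread.join t2)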

You can read more in the great Unix system programming in OCaml book:

https://ocaml.github.io/ocamlunix/ocamlunix.html#sec152

Functory: http://functory.lri.fr/

The webpage does not mention it, but I think it is also available as an Opam package.

Wow, so many great answers, thank you so much everyone!
I’ve really never done any concurrent programming, so I wasn’t aware that explicit parallelism was a thing.

Thank you for the library suggestions as well as the explanations. It is clearer now; I’ve taken a quick look at each library, and most of them seem to match what we need (our use case is really quite simple).
I’ll probably end up picking the one whose dependencies clash least with what we already have.

Thank you so much!

If you are already using Lwt, then Lwt_preemptive is the way to go (assuming you are OK with the default behaviour of a single thread executing OCaml code at any given time). If you are looking for true parallelism, then one of the other libraries mentioned above is the way to go.
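For illustration, a minimal sketch of offloading a blocking computation to a preemptive thread via Lwt_preemptive.detach; the expensive_work function here is just a placeholder for whatever you actually need to compute:

(* Sketch: run a blocking/CPU-heavy function on a worker thread so the Lwt
   event loop stays responsive.  expensive_work is a hypothetical stand-in. *)
let expensive_work n =
  Unix.sleep 1;  (* stands in for real blocking work *)
  n * n

let () =
  Lwt_main.run
    (let open Lwt.Infix in
     Lwt_preemptive.detach expensive_work 21 >>= fun result ->
     Lwt_io.printlf "result = %d" result)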

1 Like

This is not a silly question at all. Allow me to refine your definitions a little to make them more accurate.

Concurrency is how we partition multiple computations such that they can run in overlapping time periods rather than strictly sequentially. OCaml already has excellent support for concurrency in a single heap – for example explicitly via user libraries such as Lwt or Async. The venerable OCaml Thread module is also a fine way to express concurrent computations, as each of them runs with its own thread of control.

Parallelism is the act of running concurrent computations simultaneously, primarily by using multiple cores on a multicore machine.

You can get a certain amount of speedup using just concurrent programming, for example with I/O-intensive operations, so that your program is not waiting around for one connection to receive data while other connections could be processing theirs. This is the primary use case of both Lwt and Async. When you don’t want to pepper your program logic with monads, the Thread module accomplishes a similar task (Dune does this, for example).
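To make that concrete, here is a small sketch (mine, not from the original post): two simulated one-second I/O waits run under Lwt complete in about one second total, because neither blocks the other, even though everything happens on a single core.

(* Sketch: two simulated I/O waits overlap under Lwt, so total elapsed time
   is ~1s rather than ~2s, with no parallelism involved. *)
let () =
  Lwt_main.run
    (let t0 = Unix.gettimeofday () in
     let task name =
       let open Lwt.Infix in
       Lwt_unix.sleep 1.0 >>= fun () ->
       Lwt_io.printlf "%s done after %.1fs" name (Unix.gettimeofday () -. t0)
     in
     Lwt.join [ task "connection A"; task "connection B" ])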

However, these all still run on a single core until you unlock parallelism in the runtime. The Thread module takes a lock on the runtime when an OCaml thread is scheduled, to prevent any others from running at the same time. C threads, on the other hand, can continue to run in parallel just fine – it’s only the OCaml runtime that is locked against parallel execution.

Thus were born many of the solutions listed above. In order to obtain parallelism, they run multiple OCaml runtimes, for example in different processes, and use communication channels to share data between them. Data structures have to be marshalled across processes or otherwise carefully shared using IPC primitives, but it works pretty well for many use cases. The key drawback is that data structures cannot be shared seamlessly across runtimes, so you need to do some extra work in your application.
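As a deliberately bare-bones sketch of that pattern (my illustration, with error handling and the usual caveats about Marshal omitted), getting a result back from a forked child over a pipe can look roughly like this:

(* Sketch: compute in a forked child and send the result back to the parent
   over a pipe with Marshal, instead of printing it to a file or stdout. *)
let run_in_child (f : unit -> 'a) : 'a =
  let r, w = Unix.pipe () in
  match Unix.fork () with
  | 0 ->
      (* child: compute, marshal the result to the write end, then exit *)
      Unix.close r;
      let oc = Unix.out_channel_of_descr w in
      Marshal.to_channel oc (f ()) [];
      close_out oc;
      exit 0
  | pid ->
      (* parent: read the marshalled result back, then reap the child *)
      Unix.close w;
      let ic = Unix.in_channel_of_descr r in
      let result : 'a = Marshal.from_channel ic in
      close_in ic;
      ignore (Unix.waitpid [] pid);
      result

let () =
  let answer = run_in_child (fun () -> List.fold_left ( + ) 0 (List.init 1_000 (fun i -> i))) in
  Printf.printf "child computed %d\n" answer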

Which leads onto multicore OCaml: you can use Domains as units of parallelism, and explicitly manipulate shared data structures in a single heap using multiple processors simultaneously. The current push for multicore OCaml does not advance the state of concurrency for the first chunk of upstreaming, but provides the runtime primitives (mostly via the Domains module) to start parallel flows of control. Improving the state of concurrency is an entirely separate development effort that we’re working on too.
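For a flavour of what that looks like (a sketch on my part – the Domain API has shifted during multicore development, so treat it as a shape rather than a recipe), two domains computing halves of a sum in parallel:

(* Sketch: two flows of control running on separate cores via Domains,
   assuming a Domain.spawn / Domain.join style interface. *)
let sum_range lo hi =
  let acc = ref 0 in
  for i = lo to hi do acc := !acc + i done;
  !acc

let () =
  let d1 = Domain.spawn (fun () -> sum_range 1 50_000_000) in
  let d2 = Domain.spawn (fun () -> sum_range 50_000_001 100_000_000) in
  let total = Domain.join d1 + Domain.join d2 in
  Printf.printf "total = %d\n" total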

20 Likes

Thanks for such a detailed reply!
It perfectly answers everything I was wondering about :smile: