Concurrency in OCaml: It seems like the CML style never took off for anybody?

[Full disclosure: I’m pretty much a threads/mutex/condvar hacker myself, having spent too many years putting out Java server dumpster-fires to really be able to switch to one of these other styles the Kids These Days come up with (j/k j/k: I used CML for a SUNRPC project in SML in 1990; I realize it’s not hot off the griddle)]

I’m curious about whether anybody in the OCaml community uses the CML style of concurrency? OCaml has had basic support for CML since forever (“Event”) but I’ve never seen anybody use it. There are probably lots of technical reasons for that, but first, I’m curious if anybody out there actually likes this model of concurrency for ML programming. After all, it’s pretty successful in Erlang, and (heh) Golang seems to be giving it the ol’ college try, too.

No need to respond unless you’re really into CML/CSP/etc. I mean, my default assumption is, nobody does it, b/c threads/promises/etc are where the main body of programmers is. I’m certainly there.

Thanks in advance!

That’s interesting, I didn’t actually know that OCaml supported CML style. I just found this sample code that demonstrates it: http://caml.inria.fr/pub/docs/oreilly-book/html/book-ora177.html

Do you have pointers to any more example code for OCaml using events and channels? I’d like to get a feel for how you could do the usual async operations.

Ha! I’m in the same position as you are! OK, well, maybe (theoretically) epsilon better:

30 years ago I hacked with CML on the then-extant SML/NJ. And that’s it. In-between I had a commercial career, and … y’know the brain only has so much L3 cache space. I spooled that memory off to tape, which got lost in some warehouse a long time ago. So I’m basically looking for people who actually think this CML stuff is the way to go, and wanna talk about it.

Now, I should note that there is a very successful language community where this stuff is … “table stakes”. I’m talking about the Erlang community. The fact that they have nontrivial systems built using this structuring methodology/formalism, tells us that it isn’t bunk. So please don’t take this as me saying “gosh, is CML bunk?” It isn’t: Erlang is proof of that.

So: if you want to understand this model of computation, I guess the place to start is with Erlang. Me, I’m just trying to understand what the state of the art is in ML regarding CML. If the answer is “look elsewhere”, that’s a perfectly fine answer, b/c I personally think in terms of thread/mutex/condvar. I know it’s wrong, but I can’t help it.

Ha!

P.S. Why is it wrong? There’s that famous paper “message-passing is superior to shared-memory” out of MSR, so many years ago, right? It pretty much settled the question, didn’t it?

I like Event. I used to teach its usage. And at that time, I used to write things that use it. Nowadays, not really.

I think there’s beauty in Event, but there’s also poor performance if you use it too intensively, which may happen easily when you’re carried away by the elegance of the thing instead of focusing on the practical objectives!

I’ve built a couple of concurrent apps in Elixir, so using OTP and the BEAM, close enough to Erlang, and to be honest the OCaml channels and events feel somewhat different :slight_smile: Mainly I see differences in process supervision, encapsulated non blocking I/O, and the full-fledged concurrency-oriented standard library (OTP).

OCaml Event seems much more primitive in comparison, but I’ll keep an eye out for more interesting examples.

Very fair. Very, very fair. Erlang’s well-developed semantics certainly leaves anything in the ML world … in the dust. I don’t know if even CML had process-supervision (kinda doubt it did).

Almost random question: did anybody try to compile (or transpile) OCaml to BEAM (Erlang/Elixir)?

Some ML-like languages exist for the BEAM VM, like Alpaca and Gleam. Some people made OCaml programs run on the BEAM VM apparently.

Other than that, I am not aware of an elixir/erlang_of_ocaml compiler.

I’m toying with that idea.

But right now the only non-toy ML’ish language for Erlang/beam is…surprise…PureScript — https://github.com/purerl/purescript

I came across this thread last year (or maybe the year before? :sweat_smile:) while looking up CML, and I’ve been wanting to reply for a little while because I think CML is pretty cool! There isn’t much information online about its usage [1] so I don’t know how idiomatic my usage is, but I can take a stab at the question at least.

There are basically two different concurrency models provided out of the box with CML (although Futures/Promises can be implemented on top): CSP Channels (which seem to be used in Go?) and Actor Model Mailboxes (Erlang, Elixir, etc.).

The main difference I’ve seen is that “receive message” and “send message” operations block when using Channels. When trying to send a message to a Channel, the calling thread will block until another thread reads the message; when trying to read a message, the calling thread will block until another thread sends a message to this channel.
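For anyone who wants to see that rendezvous behaviour on the OCaml side, here is a minimal sketch using the standard `Event` module (it needs the threads library linked in). The function name `rendezvous_demo` is just something made up for illustration.

```ocaml
(* Sketch only: needs the threads library, e.g.
   ocamlfind ocamlopt -package threads.posix -linkpkg demo.ml *)
let rendezvous_demo () =
  let ch : string Event.channel = Event.new_channel () in
  let received = ref "" in
  (* The reader blocks in Event.sync until a sender shows up. *)
  let reader =
    Thread.create
      (fun () -> received := Event.sync (Event.receive ch))
      ()
  in
  (* The send also blocks, until the reader has taken the value. *)
  Event.sync (Event.send ch "hello");
  Thread.join reader;
  !received
```

Note that `Event.send` and `Event.receive` only build event values; nothing happens until `Event.sync` is called on them.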

The Mailbox only blocks on the “receive” operation by default (although there is a function to receive a message option, which doesn’t block and returns SOME message if there is a message to be read or NONE otherwise).
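OCaml's standard library has no Mailbox module as far as I know, but the "option-returning receive" idea has a rough counterpart on `Event` channels in `Event.poll`. A small sketch, assuming the threads library is linked; `poll_demo` is a hypothetical name:

```ocaml
(* Sketch only: Event.poll is the non-blocking version of Event.sync. *)
let poll_demo () =
  let ch : int Event.channel = Event.new_channel () in
  (* Nobody is offering a value yet, so this returns None right away. *)
  let before = Event.poll (Event.receive ch) in
  (* This sender blocks until some receiver takes its value. *)
  let sender =
    Thread.create (fun () -> Event.sync (Event.send ch 42)) ()
  in
  (* Keep polling (and yielding) until the sender's offer is visible. *)
  let rec wait () =
    match Event.poll (Event.receive ch) with
    | Some v -> v
    | None -> Thread.yield (); wait ()
  in
  let v = wait () in
  Thread.join sender;
  (before, v)
```

The busy loop is only there to keep the demo short; real code would more likely block with `Event.sync` or use `Event.choose`.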

If you wanted non-blocking IO, you could send a message (probably an ADT or record with fields like which file path to save to and what string contents to save) to a Mailbox. The main thread won’t wait for the Mailbox to perform the IO before continuing to execute the rest of the program.
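To make that concrete in OCaml terms, here is a rough sketch of the same pattern. Since OCaml has no built-in mailbox, the `mbox` type below is a hand-rolled stand-in (a queue guarded by a mutex and condition variable), and `file_msg`, `file_thread` and `save_in_background` are all names invented for the example, not anything from CML.

```ocaml
(* Sketch only: a tiny asynchronous mailbox plus a file-writer thread. *)
type file_msg =
  | Save of { path : string; contents : string }
  | Quit

type mbox = { q : file_msg Queue.t; m : Mutex.t; nonempty : Condition.t }

let create () =
  { q = Queue.create (); m = Mutex.create (); nonempty = Condition.create () }

(* send never waits for a receiver; it only enqueues. *)
let send mb msg =
  Mutex.lock mb.m;
  Queue.push msg mb.q;
  Condition.signal mb.nonempty;
  Mutex.unlock mb.m

(* recv blocks while the mailbox is empty. *)
let recv mb =
  Mutex.lock mb.m;
  while Queue.is_empty mb.q do
    Condition.wait mb.nonempty mb.m
  done;
  let msg = Queue.pop mb.q in
  Mutex.unlock mb.m;
  msg

(* The IO thread just pattern matches on whatever arrives. *)
let rec file_thread mb =
  match recv mb with
  | Save { path; contents } ->
      let oc = open_out path in
      output_string oc contents;
      close_out oc;
      file_thread mb
  | Quit -> ()

let save_in_background path contents =
  let mb = create () in
  let worker = Thread.create file_thread mb in
  send mb (Save { path; contents });  (* returns without waiting for the write *)
  send mb Quit;
  Thread.join worker                  (* only so the demo ends deterministically *)
```

In a real program the main thread would carry on after `send` and only join the worker at shutdown.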

I think there is some (limited?) form of process supervision in this module but I’ve never had a need since I’ve only been developing locally-run GUI programs with it (not as excited about building web services even though Go and Elixir/Erlang specialise in them; sorry :sweat_smile:).

I usually stick with Mailboxes since I don’t want/need the blocking that comes with Channels, but the main concurrency model people seem to care about when it comes to CML is overwhelmingly CSP/Channels. Michael Sperber mentioned in a talk that only Channels are composable (and John Reppy’s book focuses mainly on them), Andy Wingo wrote that Channels are strictly more expressive, and the Racket documentation on concurrency talks about its inspiration from CML Channels but doesn’t mention Mailboxes at all.
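My understanding of the "composable" point, sketched with OCaml's `Event` module rather than CML itself: events are first-class values, so you can combine receives on several channels with `Event.choose` and post-process each one with `Event.wrap` before anything is synchronised. The `source` type and `choose_demo` are made-up names, and the threads library is assumed.

```ocaml
(* Sketch only: wait on two channels at once and tag whichever fires. *)
type source = Left of int | Right of string

let choose_demo () =
  let ints : int Event.channel = Event.new_channel () in
  let strs : string Event.channel = Event.new_channel () in
  (* A single event value describing "either channel, tagged by origin". *)
  let either =
    Event.choose
      [ Event.wrap (Event.receive ints) (fun n -> Left n);
        Event.wrap (Event.receive strs) (fun s -> Right s) ]
  in
  (* Only the string channel gets a sender in this demo. *)
  let sender =
    Thread.create (fun () -> Event.sync (Event.send strs "ping")) ()
  in
  let result = Event.sync either in
  Thread.join sender;
  result
```

Because `either` is an ordinary value, it can be passed around or combined further before anyone calls `Event.sync` on it.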

I have some simple code I can go through to explain how CML informs my programs’ basic architecture, but I’m not sure if anyone would bother to read if this message was too long. :sweat_smile:

[1] There is John Reppy’s book on Concurrent ML which I’ve not read, as well as this PDF from 2002 which contains a section on CML and this other 2002 PDF which focuses on it, but I’ve spent little time with any of these.

Dunno if it’s exactly what you are looking for, but it’s also introduced in the section “7.7 Event-based synchronous communication” in Leroy and Rémy’s Unix system programming in OCaml book.

EDIT: I didn’t realize how old the original comment was…sorry about that.

It’s definitely worth reading in my opinion, especially for the bits about defining compositional synchronisation primitives. It also provides quite a bit of background in general concurrent programming.

After that I’m not necessarily convinced about the idea, promoted there, of using “concurrency as a structuring tool”. It leads to very imperative code and processes communicating via mailboxes suffer from the problem of callback event driven systems: the logic defining your values ends up being split in different chunks of code which can be invoked in arbitrary orders. This makes it difficult to reason about your code.

I can see how that’s the case. From the experience reports I’ve seen, it’s common to spawn many, many threads at different points in the application, which is an imperative way of doing things.

CML’s Mailboxes actually helped me to separate IO from a purely functional core though, by creating multiple threads for different purposes (a render thread which submits data for drawing with OpenGL, an update thread which stores the core application state, a file thread, a network thread, etc.).

fun sendMsg (msg, drawMailbox) =
  case msg of Draw msg => Mailbox.send (drawMailbox, msg) | ...

fun sendMsgs (msgList, drawMailbox) =
  case msgList of
    hd :: tl =>
      let val _ = sendMsg (hd, drawMailbox)
      in sendMsgs (tl, drawMailbox)
      end
  | [] => ()

fun loop (app: AppType.app_type, inputMailbox, drawMailbox) =
  let
    val inputMsg = Mailbox.recv inputMailbox
    val app = AppUpdate.update (app, inputMsg)
    (* #msgs is a record selector: #msgs app retrieves the msgs field of app *)
    val () = sendMsgs (#msgs app, drawMailbox)
  in
    loop (app, inputMailbox, drawMailbox)
  end

The other threads dedicated to IO, once they’ve received a message, just pattern match on the message and have very simple and straightforward code left to do.

Maybe that’s not idiomatic CML; I wouldn’t know, but I appreciate the recommendation!
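Since this is an OCaml forum, here is a rough OCaml rendering of the same "pure update function plus message loop" shape, using `Event` channels. It is only a sketch: the `input`, `draw_msg` and `app` types are invented for the demo, and unlike CML's Mailbox the sends here block until the render thread picks them up.

```ocaml
(* Sketch only: functional core (update) with IO pushed to other threads. *)
type input = Key of char | Stop
type draw_msg = Draw of string
type app = { text : string; msgs : draw_msg list }

(* Pure: takes the old state and an input, returns the new state
   together with the messages to hand to the render thread. *)
let update app = function
  | Key c ->
      let text = app.text ^ String.make 1 c in
      { text; msgs = [ Draw text ] }
  | Stop -> { app with msgs = [] }

let rec loop app input_ch draw_ch =
  match Event.sync (Event.receive input_ch) with
  | Stop -> app
  | input ->
      let app = update app input in
      List.iter (fun m -> Event.sync (Event.send draw_ch m)) app.msgs;
      loop app input_ch draw_ch

let run keys =
  let input_ch = Event.new_channel () in
  let draw_ch = Event.new_channel () in
  let drawn = ref [] in
  (* Stand-in render thread: records what it was asked to draw. *)
  let render =
    Thread.create
      (fun () ->
        for _i = 1 to List.length keys do
          let (Draw s) = Event.sync (Event.receive draw_ch) in
          drawn := s :: !drawn
        done)
      ()
  in
  (* Stand-in input thread: feeds the keys, then asks the loop to stop. *)
  let feeder =
    Thread.create
      (fun () ->
        List.iter (fun c -> Event.sync (Event.send input_ch (Key c))) keys;
        Event.sync (Event.send input_ch Stop))
      ()
  in
  let final = loop { text = ""; msgs = [] } input_ch draw_ch in
  Thread.join feeder;
  Thread.join render;
  (final.text, List.rev !drawn)
```

The point is the same as in the SML version: `update` never touches a channel, so it can be tested on its own.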