Hello everyone,
A few years ago my friend and I started a company called Terrateam, which does infrastructure orchestration on GitHub. Being that I am an Ocamler and we are a lean company, we chose to use Ocaml as our primary language. We recently went open source and I’m posting the link here to contribute an example of an actual company using Ocaml. A real repository.
The code can be found here.
There are a few things to note about the repo:
- It’s a mono repo, so while many of the libraries in there are generic, they are not really individually consumable as is.
- We have our own concurrency framework (more on that below).
- We use our own build library (pds, which is in opam).
- The code is in flux all the time so things change rapidly.
Why did we build our own concurrency framework?
Disclaimer: Yet another concurrency framework? Yep! Do I expect anyone to use it? Nope, and that’s ok. It is designed for our needs. It’s meant to be maintainable by one person. It’s not meant to compete with Lwt or Async for mind share. If it grows, great, if it doesn’t, I’m happy still.
Our concurrency framework is called “Asynchronous Building Blocks” (Abb). It started over a decade ago when I was frustrated with a few things:
- I wanted kqueue support in Async, but (at the time) Async required modifying a handful of repos to support it and it just wasn’t obvious how.
- Lwt supported kqueue, but for no good reason, I just didn’t like Lwt. Part of it was how failure worked in Lwt, and the other part is that it just didn’t fit my aesthetic. That isn’t a ding against Lwt, just personal preference.
- I wanted as much of it to be implemented in Ocaml as possible. As it stands now, the only C code is libkqueue, which is a little shim to allow kqueue code to run on Linux; otherwise everything is in Ocaml.
- I didn’t like that neither Async nor Lwt really supported cancelling operations. I wanted that to be part of the framework, not an ad-hoc feature per library. Coming from Erlang, cancelling is really important to me and part of how I think about writing concurrent software. I was bummed that (last I looked) Eio explicitly rejected cancelling.
- I also wanted a little experiment of “what if the concurrency library exposed a syscall interface like an OS?” So a lot of the interface is meant to look low-level (I don’t think this idea really panned out or made Abb meaningfully different).
- I also just like having my own frameworks.
Add a dash of naivete (“how hard can it be to build a concurrency framework?”) and I started my own. The first commit was Mar 9, 2013.
Much of the concurrency monad is based on an unreleased library called Fut by @dbuenzli.
Over time, Abb matured to where I could use it in my personal projects. And by the time we decided to make Terrateam, I felt it was good enough for production. And it’s been running production traffic for a few years now.
One unexpected benefit of Lwt and Async existing in the community is that adding a third one isn’t that hard. Almost all libraries that want to be used support both, and that usually means that they have a generic interface. Cohttp and Dns are examples. So I could use existing libraries for things I didn’t want to implement myself or didn’t feel I reasonably could.
I’ve also used Abb as a foundation for my web framework called Brtl (pronounced Brutal), which is both a backend framework built on Cohttp and a frontend framework built on Brr. It really doesn’t do anything fancy, like Dream; it’s pretty low level and focused on being simple.
The good:
- It works! At least, for me.
- Given that I wrote every single line of code, debugging and bug fixing (which I need to do less and less) is very easy. I also have a really great mental model of how it works.
- I like that I can just cancel a whole graph of async work if it’s no longer needed.
- The futures library works on FreeBSD, Linux, and JavaScript.
- The test coverage is pretty darn high. This is because it’s a pretty intricate thing to implement so I had to implement a lot of tests to stay sane.
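To make the idea of cancelling a whole graph of async work concrete, here is a minimal sketch of a single-threaded future where cancellation spreads in both directions along dependencies. This is a toy written for illustration, not Abb's implementation or API; the names `fut`, `create`, `resolve`, `cancel`, and `map` are all hypothetical.

```ocaml
(* A toy, single-threaded future with cancellation. All names here are
   hypothetical; this is not Abb's implementation or API. *)
type 'a state =
  | Pending
  | Resolved of 'a
  | Cancelled

type 'a fut = {
  mutable state : 'a state;
  mutable on_resolve : ('a -> unit) list;  (* run when a value arrives *)
  mutable on_cancel : (unit -> unit) list; (* run when cancelled *)
}

let create () = { state = Pending; on_resolve = []; on_cancel = [] }

(* Settle a pending future with a value; a no-op if already settled. *)
let resolve fut v =
  match fut.state with
  | Pending ->
    fut.state <- Resolved v;
    let callbacks = List.rev fut.on_resolve in
    fut.on_resolve <- [];
    fut.on_cancel <- [];
    List.iter (fun f -> f v) callbacks
  | Resolved _ | Cancelled -> ()

(* Cancel a pending future; a no-op if already settled, which is what
   stops the mutual cancel callbacks below from looping forever. *)
let cancel fut =
  match fut.state with
  | Pending ->
    fut.state <- Cancelled;
    let callbacks = List.rev fut.on_cancel in
    fut.on_resolve <- [];
    fut.on_cancel <- [];
    List.iter (fun f -> f ()) callbacks
  | Resolved _ | Cancelled -> ()

(* [map f src] builds a dependent future. Cancelling either end cancels
   the other, so cancellation spreads across the whole graph. *)
let map f src =
  let dst = create () in
  (match src.state with
   | Resolved v -> resolve dst (f v)
   | Cancelled -> cancel dst
   | Pending ->
     src.on_resolve <- (fun v -> resolve dst (f v)) :: src.on_resolve;
     src.on_cancel <- (fun () -> cancel dst) :: src.on_cancel;
     dst.on_cancel <- (fun () -> cancel src) :: dst.on_cancel);
  dst

(* Usage: a three-node chain, cancelled from the leaf. *)
let root : int fut = create ()
let doubled = map (fun x -> x * 2) root
let shown = map string_of_int doubled
let () = cancel shown
```

After the final line, all three futures are in the `Cancelled` state, even though only the leaf was cancelled. A real framework would also need to release whatever the scheduler holds for the cancelled work (timers, file descriptors), which this sketch leaves out.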
The bad:
- Performance is not anything special. I don’t think this is a fundamental flaw, it’s just that it is as fast as I need it to be right now.
- Some of the API is awkward if you don’t know the system. Or names are long, like Abb_future_combinators.
- The multi-target build story kind of sucks. I think that might be a bigger issue with the pds build system, but for now in the web framework you have to use Abb_js rather than Abb for everything.
- There are definitely some corners cut, especially around file IO, but that’s OK, we don’t do much file IO.
- It explicitly takes advantage of the fact that everything runs in a single thread. So implicitly going multi-threaded would probably break things.
Future work:
There really isn’t a lot of future work. For the most part: Abb is done. Or should I say the interface is done. Yes, it will need updates to fight bit rot, but there isn’t much more for it to do. It runs your code concurrently, the end.
However, as Ocaml5 becomes more of a thing, it will need to take advantage of that. I haven’t really thought about how to do it. One item I have in my to-do list is to evaluate if Picos could be a base layer for Abb. Abb is a layered approach so really you only need to implement the Abb_intf.S interface and everything above that should Just Work (given single threaded semantics). I think any future work to support multi core will probably need an explicit “this crosses a thread boundary” API. Abb will get there, eventually, but right now it doesn’t need to.
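The layering described here is the usual OCaml functor pattern: higher layers are written once against a signature, and any module that satisfies the signature can be plugged in underneath. A minimal sketch follows. The `BASE` signature is a hypothetical stand-in and does not reproduce the real Abb_intf.S; `Combinators` and `Immediate` are likewise made up for illustration.

```ocaml
(* Hypothetical stand-in for a base-layer signature. The real Abb_intf.S
   is not reproduced here; these members are assumptions. *)
module type BASE = sig
  type 'a t
  val return : 'a -> 'a t
  val bind : 'a t -> ('a -> 'b t) -> 'b t
  val run : 'a t -> 'a
end

(* Everything above the base layer is written once, against the signature. *)
module Combinators (B : BASE) = struct
  let map f t = B.bind t (fun v -> B.return (f v))

  (* Run a list of computations in order and collect their results. *)
  let rec all = function
    | [] -> B.return []
    | t :: ts -> B.bind t (fun v -> map (fun vs -> v :: vs) (all ts))
end

(* A trivial base layer: computations are thunks that run synchronously,
   on one thread, when [run] is called. *)
module Immediate : BASE = struct
  type 'a t = unit -> 'a
  let return v () = v
  let bind t f () = f (t ()) ()
  let run t = t ()
end

(* Swapping the base layer means changing only this functor argument. *)
module C = Combinators (Immediate)

let result = Immediate.run (C.all [ Immediate.return 1; Immediate.return 2 ])
```

Here `result` is the list `[1; 2]`. Replacing `Immediate` with a different module of type `BASE` would leave `Combinators` untouched, which is the property that makes evaluating a new base layer a contained piece of work.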
Effects will obviously have a big impact, but I have no idea what that’ll do for Abb. I hope I can transition it slowly to supporting effects, but I don’t want to look at effects until they’re in the type system.
Some libraries in the repo that may be of interest:
- Abb_scheduler_kqueue - The most used scheduler. It implements the Abb_intf interface.
- Abb_scheduler_select - A simpler select-based scheduler. This is meant to be used any place kqueue is not supported and also as a demo.
- Pgsql_io - An implementation of the PostgreSQL protocol.
- Githubc2 - A GitHub REST library automatically generated from their JSON Schema. This actually has no Abb dependency, it just implements the API serializing/deserializing.
- OpenAPI CLI - This generates a library (see Githubc2) from an OpenAPI spec. It is, absolutely, a bit of a rat’s nest, but it works. We chose to do code-gen for this because I didn’t want to be blocked when compiling based on different compiler versions as we’re using Ast_helper. Ocaml, the code, is more stable than Ocaml, the compiler API.
There is a bunch of other stuff in there. If you decide to poke around and have any questions, feel free to ask. I can promise: not every decision in there is well thought out or coherent.