Hey! We just launched a new mailing list powered entirely by OCaml unikernels. The website (itself a unikernel) is at https://mailingl.st. You can subscribe to ptt@mailingl.st by sending an email to ptt-subscribe@mailingl.st if you’re interested in the development and deployment of SMTP-related unikernels.
Fair warning: this is still a public test mailing list for now. In the long run, it will focus on our ptt project.
The SMTP protocol: a long and winding road!
In the beginning, email
It all started with Mr.MIME, our library for decoding and encoding emails. It’s a synthesis of the relevant RFCs, but more importantly it’s been battle-tested against real-world emails from the IEEE, Enron, KVM and, most recently, the caml-list.
This work also let us build Hamlet, a database of valid random emails generated using a fuzzer.
Under the hood, Mr.MIME relies on unstrctrd for decoding the most general form of values found in an email (with internationalisation support via rosetta) and prettym for encoding emails while respecting SMTP constraints and (Comment) Folding Whitespace handling.
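To give an idea of what "respecting SMTP constraints" means when encoding, here is a minimal sketch of header folding. It is not prettym's API: the `fold` function and its `limit` parameter are hypothetical names, and the only assumption is the RFC 5322 recommendation to keep lines within 78 characters by inserting a line break before existing whitespace.

```ocaml
(* Hypothetical sketch, not prettym's API.
   Fold a header line so that no physical line exceeds [limit] characters
   (CRLF excluded). A fold is inserted only where a space already exists,
   so removing every CRLF that precedes a space gives back the input. *)
let fold ?(limit = 78) line =
  let words = String.split_on_char ' ' line in
  let buf = Buffer.create (String.length line + 16) in
  let col = ref 0 in
  List.iteri
    (fun i w ->
      let len = String.length w in
      if i = 0 then begin
        Buffer.add_string buf w;
        col := len
      end
      else if !col + 1 + len > limit then begin
        (* Start a continuation line: CRLF followed by one space. *)
        Buffer.add_string buf "\r\n ";
        Buffer.add_string buf w;
        col := 1 + len
      end
      else begin
        Buffer.add_char buf ' ';
        Buffer.add_string buf w;
        col := !col + 1 + len
      end)
    words;
  Buffer.contents buf
```

A single word longer than the limit is left as-is here; a real encoder has to deal with that case, as well as comments and encoded words.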
Next, the protocol
Then came colombe, our OCaml implementation of the SMTP protocol. It uses ocaml-tls for STARTTLS support.
The protocol is supposedly “simple” (though the Internet always has surprises in store), but from day one we designed colombe to be independent of any scheduler and network layer. That way it slots right into unikernels without friction.
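As an illustration of what "independent of any scheduler and network layer" can look like, here is a minimal sketch in which the protocol is a plain value describing what it needs next. The type `step` and the functions `greet` and `run_in_memory` are hypothetical and are not colombe's actual API; multi-line SMTP replies are deliberately ignored.

```ocaml
(* Hypothetical sketch, not colombe's API. *)

(* A protocol step asks for input, asks to send output, or terminates. *)
type 'a step =
  | Read of (string -> 'a step)          (* needs one line from the peer *)
  | Write of string * (unit -> 'a step)  (* wants to send one line *)
  | Return of 'a
  | Fail of string

let starts_with prefix s =
  String.length s >= String.length prefix
  && String.sub s 0 (String.length prefix) = prefix

(* Client side of the greeting: expect 220, send EHLO, expect 250. *)
let greet ~domain =
  Read (fun banner ->
    if not (starts_with "220" banner) then Fail ("unexpected banner: " ^ banner)
    else
      Write ("EHLO " ^ domain, fun () ->
        Read (fun reply ->
          if starts_with "250" reply then Return ()
          else Fail ("EHLO rejected: " ^ reply))))

(* One possible driver: no socket, no scheduler, just lists of lines.
   Returns the final value together with the lines that were sent. *)
let run_in_memory step ~input =
  let rec go step input sent =
    match step with
    | Return v -> Ok (v, List.rev sent)
    | Fail msg -> Error msg
    | Write (line, k) -> go (k ()) input (line :: sent)
    | Read k ->
      (match input with
       | [] -> Error "unexpected end of input"
       | line :: rest -> go (k line) rest sent)
  in
  go step input []
```

The same `greet` value could be driven by a blocking socket loop, an effect-based scheduler, or a unikernel's TCP stack; only the driver changes.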
Finally, legitimacy
On top of these core components, we built several email security layers:
- ocaml-dkim signs emails and verifies their integrity, in a streaming fashion for both operations
- uspf verifies sender identity and, like most of our libraries, stays independent of any scheduler or DNS implementation
- ocaml-dmarc automates DKIM and SPF verification, stamps emails with the result, and checks that the authenticated domains align with the sender's domain name
- ocaml-arc lets you verify and sign emails to complete a chain of trust when an email passes through multiple SMTP servers (which is exactly what happens with a mailing list)
We wrote a short article about all of this here.
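The alignment check mentioned for DMARC can be sketched as follows. This is not ocaml-dmarc's API: every name below is hypothetical, a single mode is used for both SPF and DKIM (DMARC allows one per mechanism), and `naive_org_domain` is a deliberately crude stand-in for a Public Suffix List lookup.

```ocaml
(* Hypothetical sketch, not ocaml-dmarc's API. *)

type mode = Strict | Relaxed

let normalise d = String.lowercase_ascii d

(* Crude stand-in: keep the last two labels. This is wrong for suffixes
   such as "co.uk"; a real implementation uses the Public Suffix List. *)
let naive_org_domain d =
  match List.rev (String.split_on_char '.' (normalise d)) with
  | tld :: sld :: _ -> sld ^ "." ^ tld
  | _ -> normalise d

(* Strict: the two domains must be identical.
   Relaxed: their organisational domains must be identical. *)
let aligned ?(org_domain = naive_org_domain) mode ~from_domain ~auth_domain =
  match mode with
  | Strict -> normalise from_domain = normalise auth_domain
  | Relaxed -> org_domain from_domain = org_domain auth_domain

(* Result of an SPF or DKIM check and the domain it authenticated. *)
type check = { passed : bool; domain : string }

(* DMARC passes when at least one mechanism both passes and is aligned
   with the domain of the From header. *)
let dmarc_pass mode ~from_domain ~spf ~dkim =
  let ok c = c.passed && aligned mode ~from_domain ~auth_domain:c.domain in
  ok spf || List.exists ok dkim
```

A mailing list typically breaks SPF alignment, since the list server becomes the envelope sender, which is one reason DKIM and ARC matter so much in that setting.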
All in the form of unikernels
Our first experiments already showed that we could handle emails with MirageOS unikernels. But we also hit real limitations: memory leaks, security vulnerabilities, and build issues.
So we decided to start fresh, and take the opportunity to fully embrace OCaml 5 and effects. We rebuilt the key pieces from scratch:
- a new effect-based scheduler: Miou/Mkernel
- a much more complete TCP/IP stack: Mnet
- a new FAT32 file system: Mfat
ptt is built on this new stack, and so far we haven’t observed any memory leaks (thanks to mkernel-memtrace for tracing memory usage, viewable via memtrace-viewer). The CVEs related to mirage-tcpip were taken into account during mnet’s development, and the build story is much simpler now. A GitHub action can build and actually run the unikernel to test it, as you can see with mnet.
Other unikernels using this approach are available too. If you’re curious, check out this tutorial on creating a unikernel in OCaml.
Deployment
ptt also tackles the deployment question. We have an article presenting the “stateless” aspect of ptt. We’d also like to (re)introduce Albatross, our secure unikernel orchestrator, and Mollymawk, a web interface for deploying unikernels (which is itself a unikernel).
More broadly, this is what our cooperative is working towards: we really want to improve the user experience, whether you’re a developer or a deployer. We believe that actually developing, deploying, and using our unikernels is the only way to get them adopted more widely. So make sure to follow us on these projects too!
Usage
Along the way, we found it really helpful to have a tool that lets you track every stage of an email’s lifecycle. That’s how blaze came about: a Swiss Army knife for handling emails.
It’s still experimental, but it already lets you:
- use our archive system (generate, read, index, etc.)
- handle other archives such as mbox or maildir
- communicate via the POP3 protocol
- sign and verify emails (DKIM and ARC)
- build emails from the command line
- send emails
- run a small local SMTP server
blaze is how we iterate on our library APIs and validate our implementations, and it’s gradually turning into a full email client.
Archiving & Indexing
We’d also like to present the stem project, which extracts word roots from a document (such as an email) and tokenises them to get something analysable without the complexity of natural language. This tokenisation is what powers our small bm25 search engine. You can see results here.
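For readers unfamiliar with BM25, here is a minimal sketch of the textbook Okapi BM25 score over documents that have already been tokenised (and, in our case, stemmed). It is not the code of our bm25 library: the function names, the list-based representation and the default parameters `k1 = 1.2` and `b = 0.75` are assumptions chosen for the example, and the corpus is assumed to be non-empty.

```ocaml
(* Hypothetical sketch of textbook BM25, not the API of our bm25 library.
   A document is a list of tokens; the corpus is a list of documents. *)

(* Number of occurrences of [term] in [doc]. *)
let count term doc =
  List.fold_left (fun n t -> if String.equal t term then n + 1 else n) 0 doc

let bm25 ?(k1 = 1.2) ?(b = 0.75) ~corpus ~query doc =
  let n_docs = float_of_int (List.length corpus) in
  let total_len = List.fold_left (fun acc d -> acc + List.length d) 0 corpus in
  let avg_len = float_of_int total_len /. n_docs in
  let doc_len = float_of_int (List.length doc) in
  let score_term term =
    (* Number of documents that contain the term at least once. *)
    let df = float_of_int (List.length (List.filter (List.mem term) corpus)) in
    (* Rare terms weigh more; the "+ 1" keeps the weight positive. *)
    let idf = log (((n_docs -. df +. 0.5) /. (df +. 0.5)) +. 1.0) in
    let tf = float_of_int (count term doc) in
    (* Term frequency saturates with k1 and is normalised by length with b. *)
    idf *. tf *. (k1 +. 1.0)
    /. (tf +. k1 *. (1.0 -. b +. b *. doc_len /. avg_len))
  in
  List.fold_left (fun acc term -> acc +. score_term term) 0.0 query
```

Recomputing document frequencies on every call is fine for a sketch; an actual search engine precomputes them in an index.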
This is also what drives our caml-list search engine, available as a unikernel: blame, which you can try at https://caml-list.robur.coop (powered by vif).
Beyond search, there’s also email indexing via Message-IDs. For that we built bancos: a persistent radix tree in OCaml that supports parallel access! More details here.
Finally, our indexing system uses the PACKv2 format (the same one Git uses to store objects), implemented by the carton library. It has proven its stability through the ocaml-git project, so we decided to reuse it for archiving emails (much like public-inbox did, though in a different form).
Conclusion
Thanks to all this work, OCaml now has a solid set of email-related projects. This journey started back in 2016 and there’s still a long way to go, as we always aim to offer robu(r)st, battle-tested solutions. Unlike some implementations in other languages (though we are in discussion with folks in the Rust community), ours actually adhere to the standards!
It may not seem like a big deal, and you won’t see any major difference when just exchanging emails, but we believe this approach paves the way for a better internet. In the form of unikernels, it represents a genuine reclaiming of the means of communication!