You’re right that I’d prefer people not to use the low-level APIs. I consider them to be similar to the Obj
module in OCaml - there if you need them, but not for general use. I’ll try to explain why that is, and perhaps you can spot a flaw in my logic.
When people ask me why they should use OCaml, I’d like to tell them that it has a firm basis in mathematics, and makes it easy to reason about their code.
So let’s forget about capabilities and look at reasoning instead. Usually there’s too much code in a program to read all of it (especially when considering libraries, the OS, and remote services), so functions need some kind of summary, such that we can just read the summary most of the time. This includes the function’s type, and maybe its comment, or even a Gospel specification.
As a simple example, imagine trying to specify a negate
function (pretend this function has a complicated implementation that we don’t have time to read):
let a = ref true
let b = ref true
(* [negate x] sets [x] to its negation. *)
let negate x = ...
let () =
negate a;
negate b;
assert (not !a);
assert (not !b)
We can use an informal comment, as above, or try something formal. For example, here’s a Hoare triple for it:
forall x, v. { !x = v } negate x { !x = not v }
This says that if !x = v
and then you call negate x
, then afterwards !x = not v
.
This might seem like a fairly comprehensive specification, but in fact it’s mostly useless. Looking at the above program, we can’t say whether either assert will pass! For example, we know a
is false immediately after the negate a
, but we don’t know what negate b
does to a
, so it could be anything by the end.
As programmers, we probably assume negate b
won’t affect a
, but what is the rule we’re following when making this assumption? I can think of a few possibilies:
- A function should be implemented in the obvious way.
- A function should only use the minimum state needed to do its job.
- A function should only use state reachable from its arguments.
Maybe you have your own idea; I’d be interested to know how people think about this!
negate
is simple, but here’s a trickier example:
val handle_request : request -> response
This is a pure function that just calculates an HTTP response from a request. Now, let’s say (to use the example suggested earlier) someone decides to add some Redis caching. Maybe not in this function itself, but in some function it calls indirectly, they make some calls to http://redis/
. And they don’t want to plumb through anything (that seems to be your worry?), so they don’t change the signature or the comment of handle_request
.
So: is that a reasonable thing for a programmer to do?
It violates all the rules above:
- It’s not obvious it uses Redis, since it didn’t before.
- It doesn’t need to use Redis.
- Redis isn’t reachable from its arguments.
If we say it’s OK to do this, then we can’t reason about programs at all; any function could do anything! A testing framework could one day delete all our files, or a logging library could download and execute code mentioned in the log message!
In this example, let’s say I’m asked to work on the server code. I make some changes locally and add some unit-tests. But unknown to me, every time I run the tests, I’m corrupting the production Redis cache!
And this is really hard to track down. The production team is pulling their hair out trying to find out why the cache keeps getting messed up, but they have no easy way to find all the places where the code uses Redis. And in fact, the bug isn’t even in their version of the code.
Now, it might strike us as odd that this problem can occur at all. OCaml is a memory-safe language, which means that functions can’t just corrupt each other’s memory, and Redis is an in-memory database, yet the memory is being corrupted. The reason is that when people say “memory-safe”, they are usually only talking specifically about local memory (i.e. memory in the OCaml process’s heap), not remote memory such as in the Redis service. And indeed, if the production service, the tests, and the cache were all a single OCaml program, communicating using function calls, we wouldn’t expect to have this kind of problem; it would be obvious that the tests shared the cache with production, because we passed the same cache to both of them.
“memory-safe” also usually means local volatile memory (i.e. RAM), not non-volatile memory (e.g. SSDs); we shouldn’t be surprised if one OCaml function corrupts another function’s files. So one solution to our problem would be to make a “fully” memory-safe system (and this is what a capability system is).
But this is not the only option. I expect that most people would say that the real problem is that our Redis service wasn’t properly secured. To make things easy, we give resources global names (like “redis”), which avoids the need to plumb references around. This means that anything that should be using Redis can do so easily. But it also means that anything that shouldn’t can do so too. The general solution to this is to add an access control layer, and this is what Redis does.
Doing this brings a couple of problems though. Obviously it’s a load of extra work and complexity, and we now need to include Redis’s ACLs in our reasoning system. But also, we now need to plumb the password through our service:
val handle_request : redis_password:string -> request -> response
So, we get the same cost as when using references, but without the benefits.
Back to our negate
function. The problem of how to specify functions in a modular way was eventually solved (and rather elegantly) by the invention/discovery of Separation Logic. Here’s an updated specification:
forall x, v, p. { (!x = v) * p } negate x { (!x = not v) * p }
The idea is to partition the state of the system (e.g. the set of allocated ref-cells) into two non-overlapping parts. The spec above says: if the state can be partitioned such that !x = v
is true of the first part and p
is true of the second, and you then do negate x
, then afterwards the state can be partioned such that !x = not v
is true of the first part and p
is still true of the second. In order words, negate x
only affects x
.
Functions having * p
on both sides can be composed safely. For example, when considering the effect of negate a
, we can use the specification like this:
{ !a = true * !b = true } negate a { !a = false * !b = true }
Composable functions are so useful, and non-composable ones so useless, that in fact we just redefine our notation so that every specification is understood to have an implicit * p
.
So being able to reason about code requires plumbing through effects somehow. We must mention them somewhere. There are various ways to do that. Passing an argument is the most obvious, which is what Eio encourages:
val handle_request : redis -> request -> response
If there are several things, we can bundle them together if we like:
val handle_request : config -> request -> response
If Redis is getting corrupted, that makes it a little harder to find out where, since config
is likely passed to more places than redis
, but it still gives us a bound on the behaviour and a set of places to look.
There are other ways of doing it, such as using an IO monad:
val handle_request : request -> response io
For example, Haskell uses an IO monad for all side-effects, allowing them to partition their functions into two groups: those that can’t do anything, and those they can’t reason about.
Incidentally, if we were using Lwt then we’d have to plumb Lwt.t
through handle_request
when adding Redis support. That would be more invasive than adding config
with Eio, while still not putting any bound on behaviour.
We could also make a more restricted “redis monad”:
val handle_request : request -> response redis
But this is just more complicated than passing the argument, and composing multiple monads is tricky.
Finally, if OCaml had implicit arguments then we could pass redis implicitly. We’d still need to annotate handle_request
to say that it used Redis, but the code wouldn’t show where it used it. This is a bit shorter, but also less clear, and quite bad from a security point of view.
In summary, the reason why I want people to avoid the low-level APIs is because we can’t reason about such code. Ideally, the compiler would warn about it (and about Obj.magic
). Giving up reasoning, clarity and ease of debugging are not small matters, and I’d need to see some pretty impressive benefits to consider it.
(Minor correction: I think Redis actually uses its own protocol rather than HTTP, but the same logic should apply)