Eio 0.1 - effects-based direct-style IO for OCaml 5

Eio provides an effects-based direct-style IO stack for OCaml 5.0. It aims to be easy to use, secure, well documented, and fast. It consists of a generic cross-platform API, plus optimised backends for different platforms.

This 0.1 release is for early adopters to try it out and provide feedback. It works with OCaml 5.00.0+trunk at the time of writing, but as that is a moving target we suggest using 4.12.0+domains for now:

opam switch create 4.12.0+domains --repositories=multicore=git+https://github.com/ocaml-multicore/multicore-opam.git,default
opam install eio_main

There is a tutorial giving a tour of the main features:

  • Concurrent code without a need for monads or special syntax: no more >>=, let%lwt, Lwt_list.iter_s, etc. Just use plain OCaml code (|>, let, List.iter).
  • Concurrency primitives (promises, streams, semaphores), as usual.
  • Run multiple fibres on a single core, or distribute work across multiple CPUs.
  • Replace any OS features (files, networks, clocks, etc) with mocks for testing.
  • Structured concurrency, to prevent leaking resources (such as open file descriptors or fibres).
  • Automatic handling of cancellation, so that if one part of your program fails then other fibres are cancelled automatically until the exception is handled or the program exits with a stack trace.
  • Capability-based security. For example, a Dir.t only grants access within some sub-tree of the filesystem, and prevents escaping it using .. or by following symlinks out of the tree.
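
To give a feel for the last point, here is a rough sketch of the lexical part of such a check. It is only an illustration: `confine` is a hypothetical helper, not Eio's API, and it does not handle symlinks at all (a real implementation needs help from the OS for that, and collapsing `a/..` lexically is itself wrong when `a` is a symlink).

```ocaml
(* Hypothetical helper, not part of Eio: normalise a relative path and
   reject it if it would climb above the directory it is relative to.
   Purely lexical; symlinks are NOT handled here. *)
let confine (path : string) : string option =
  if not (Filename.is_relative path) then None
  else
    let step acc seg =
      match acc, seg with
      | None, _ -> None
      | Some stack, ("" | ".") -> Some stack        (* skip empty and "." segments *)
      | Some [], ".." -> None                       (* would escape the root *)
      | Some (_ :: rest), ".." -> Some rest         (* go up one level inside the root *)
      | Some stack, seg -> Some (seg :: stack)
    in
    String.split_on_char '/' path
    |> List.fold_left step (Some [])
    |> Option.map (fun stack -> String.concat "/" (List.rev stack))

let () =
  List.iter
    (fun p ->
      match confine p with
      | Some q -> Printf.printf "%s -> %s\n" p q
      | None -> Printf.printf "%s -> rejected\n" p)
    [ "docs/./x.txt"; "a/../b"; "a/../../etc/passwd" ]
```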

Various libraries are in the process of being ported to Eio. Examples include a Gemini client (including ports of angstrom, notty and ocaml-tls), and Dream (providing a direct-style API to users while using lwt-eio internally to integrate with existing libraries).

Performance seems good, especially on recent versions of Linux where Eio can take advantage of io_uring. For example, this graph shows how various HTTP servers cope with increasing load:

httpaf_eio is currently at the top (though note that the Rust server, rust_hyper, is not using io_uring, so this is not a completely fair comparison).

Useful resources:

Feedback, and PRs adding missing features, are welcome!

If anyone’s looking to contribute, then the various IO backends are a great place to go:

  • macOS is under development by @patricoferris over at https://github.com/ocaml-multicore/eio/pull/26
  • Windows needs some attention. If anyone would like to make a start on an IOCP async backend, that would complete the compatibility suite for ‘native IO’ (it works at the moment via libuv as a fallback).

An IOCP backend would indeed be nice! That said, if something relatively easy is desired in the meantime (and a readiness model suffices), I’ve had decent experience using wepoll, which provides an epoll-like layer on top of IOCP. The bindings I wrote for it can be seen in the pull request “Add a wepoll backend” (Pull Request #5 in anuragsoni/poll on GitHub).

I had plans for looking into writing bindings for IOCP itself, but I ended up not getting a Windows device for myself, so it’s been a little difficult to learn enough of the Windows API to make progress on that front :sweat_smile:

I’m curious about the use of env in Eio. E.g. in the draft PR for Dream, it’s passed in a few places, e.g.

let () =
  Eio_main.run @@ fun env ->
  Fibre.both
    (fun () -> message_loop env#clock)
...

And also,

let message_loop clock =
  while true do
    Eio.Time.sleep clock (Random.float 2.);
    incr last_message;
...

I’m wondering, why does Eio.Time.sleep need an explicit clock passed in? Shouldn’t it be able to get a clock by performing an effect and having a handler provide it?
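
To make the question concrete, this is roughly what I have in mind. It is only a sketch, assuming the `Effect` module from the OCaml 5.0 standard library; the `Now` effect and both functions are made up for illustration and are not part of Eio.

```ocaml
(* Requires OCaml 5.0 or later for the stdlib Effect module. *)
open Effect
open Effect.Deep

(* Hypothetical effect, not part of Eio: ask the enclosing handler for the time. *)
type _ Effect.t += Now : float Effect.t

(* Nothing in this function's type says that it depends on a clock. *)
let timestamp (msg : string) : string =
  Printf.sprintf "[%.1f] %s" (perform Now) msg

(* Run [f ()] with every [Now] effect answered by the fixed value [time]. *)
let with_fixed_clock (time : float) (f : unit -> 'a) : 'a =
  try_with f ()
    { effc = (fun (type b) (eff : b Effect.t) ->
        match eff with
        | Now -> Some (fun (k : (b, _) continuation) -> continue k time)
        | _ -> None) }

let () =
  print_endline (with_fixed_clock 42.0 (fun () -> timestamp "hello"))
```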

There’s no technical problem with doing that, but being explicit here is the goal. It means you can see from a function’s type that it uses a clock, and can control which one it uses. For example:

  1. For tests you might want a clock that uses a fake time.
  2. For times that are reported to untrusted users, you might want a low-resolution timer to make timing attacks more difficult.
  3. For battery-saving, you might want a clock that groups nearby events together to avoid doing multiple wakes and sleeps.
  4. Code that uses a clock may be non-deterministic, so you might want to avoid caching the result.
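
A minimal sketch of points 1 and 2, using a made-up `clock` record rather than Eio's real clock type (everything here is hypothetical and uses only the standard library). The point is that the code under test takes the clock as an argument, so a test can hand it a fake one, and a caller can wrap any clock to coarsen it.

```ocaml
(* A hypothetical, minimal clock capability; Eio's real clock type differs. *)
type clock = { now : unit -> float; sleep : float -> unit }

(* Point 1: a fake clock for tests. Sleeping just advances a counter. *)
let fake_clock ?(start = 0.0) () : clock =
  let t = ref start in
  { now = (fun () -> !t); sleep = (fun d -> t := !t +. d) }

(* Point 2: wrap any clock so that reported times are rounded down
   to a multiple of [resolution]. *)
let coarse ~(resolution : float) (c : clock) : clock =
  { c with now = (fun () -> Float.floor (c.now () /. resolution) *. resolution) }

(* The type shows that this function needs a clock, and the caller picks which. *)
let sample (clock : clock) ~(interval : float) ~(count : int) : float list =
  let rec go n acc =
    if n = 0 then List.rev acc
    else begin
      clock.sleep interval;
      go (n - 1) (clock.now () :: acc)
    end
  in
  go count []

let () =
  let clock = fake_clock () in
  sample clock ~interval:1.5 ~count:3
  |> List.iter (fun t -> Printf.printf "woke at %.1f\n" t)
```

With the fake clock the test finishes immediately, since no real sleeping happens.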

All very good reasons, but why force that on the users of a foundational
library, rather than let them choose whether or not they want to juggle
capabilities?

Thanks for explaining. It does make sense from a typing perspective. I imagine somewhere down the line we’ll have typed effects and that will do the same job. E.g. (made-up syntax),

val message_loop : Clock effect -> unit -> unit

For points 1 to 3 I may be mistaken, but they seem possible with effect handlers too, i.e. fitting in a custom handler closer to the callsite in places where more control is needed. For point 4, I’m not clear how we would know to not cache. Are you referring to the heuristic that the function takes a clock, so we should notice that and perhaps avoid caching its result?

This is an important question!

Giving a choice actually requires doing it this way. With the current API, you can always store the capabilities in global variables and use them that way if you prefer. However, if Eio itself provides globals for “the clock”, “the file-system”, “the network”, etc, and libraries use them, then users don’t have a choice.

For example, the standard library’s open_in allows reading from files in the process’s default mount namespace, but it can’t read from a zip archive, a remote file-system, or a container namespace. Any library using the standard library then has the same limitation.
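
As a rough illustration (a hypothetical `dir` record, not Eio's `Dir.t`), a library that takes its source of files as an argument works unchanged whether the files come from the real file-system or from somewhere else. Here an in-memory table stands in for a zip archive or remote store. Note that `real_dir` below does no sandboxing; it only shows the shape of the interface.

```ocaml
(* Hypothetical read-only "directory" capability: maps a path to file contents. *)
type dir = { load : string -> string option }

(* Backed by the process's real file-system via the standard library. *)
let real_dir (root : string) : dir =
  { load = (fun path ->
      match open_in_bin (Filename.concat root path) with
      | exception Sys_error _ -> None
      | ic ->
        Fun.protect ~finally:(fun () -> close_in ic)
          (fun () -> Some (really_input_string ic (in_channel_length ic)))) }

(* Backed by an in-memory table, standing in for an archive or remote store. *)
let mem_dir (files : (string * string) list) : dir =
  { load = (fun path -> List.assoc_opt path files) }

(* Library code: it cannot tell, and does not care, where the data lives. *)
let count_lines (dir : dir) (path : string) : int option =
  Option.map
    (fun s -> List.length (String.split_on_char '\n' (String.trim s)))
    (dir.load path)

let () =
  let dir = mem_dir [ ("notes.txt", "a\nb\nc\n") ] in
  match count_lines dir "notes.txt" with
  | Some n -> Printf.printf "notes.txt has %d lines\n" n
  | None -> print_endline "notes.txt not found"
```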

A second reason is that the goal of using capabilities is to make the code easier to read. Let’s say you create a Global module and start your program with:

let () =
  Eio_main.run @@ fun env ->
  Global.fs := Eio.Stdenv.fs env;
  main ()

Now it’s no longer easy to see which bits of code access the file-system - to check, you’d have to examine everything that transitively depends on Global. Which may well be fine; if you’re converting existing code that assumes everything may access files then this is no worse than before. But if Eio provides Global itself then you’d have to check everything that depends on Eio, which is a problem for everyone.

A third reason is that not all platforms work the same way. For example, a browser environment probably doesn’t have a file-system at all, while a MirageOS unikernel doesn’t have any concept of a default file-system (it may have several filesystems, or none at all).

Also, note that Mirage already works this way (in fact, it’s a lot more heavy-weight as it also requires functorising everything).

To summarise, if you have a library that doesn’t use global variables but you just want a single instance of something, it’s easy to do that. But if you have a library that does use globals then it can’t be used any other way.

I think the style Eio encourages works well though, so I’d suggest giving that a go first.

This is an important question!

Giving a choice actually requires doing it this way. With the current API, you can always store the capabilities in global variables and use them that way if you prefer. However, if Eio itself provides globals for “the clock”, “the file-system”, “the network”, etc, and libraries use them, then users don’t have a choice.

That makes a lot of sense, thank you!

For example, the standard library’s open_in allows reading from files in the process’s default mount namespace, but it can’t read from a zip archive, a remote file-system, or a container namespace. Any library using the standard library then has the same limitation.

Agreed that it’s a problem. That’s more a limitation of in_channel
than open_in, imho, but I’ve said enough on that topic in the past.

Now it’s no longer easy to see which bits of code access the file-system - to check, you’d have to examine everything that transitively depends on Global. Which may well be fine; if you’re converting existing code that assumes everything may access files then this is no worse than before. But if Eio provides Global itself then you’d have to check everything that depends on Eio, which is a problem for everyone.

Also makes sense. I can imagine that instead of Env, one might pass
“globals” in a functor argument, to control granularity.

Thank you for the detailed answer.

Some features of Eio seem to be similar to features of Domainslib. For example, Eio.Stream.t and the bounded version of Domainslib.Chan.t. And the worker pool example in the Eio tutorial appears to implement the functionality of a Domainslib.Task.pool. Am I missing an important difference between the two?

Using typed effects would work if there’s a single clock (which will usually be the case). However, it isn’t so useful where you have multiple resources of the same type. An obvious example is directories, where knowing “this function accesses the file-system” isn’t all that useful; you want to know which directories it accesses.

e.g.

val run_webserver :
  www_root:Eio.Dir.t ->
  certificates:Eio.Dir.t ->
  Eio.Net.listening_socket ->
  unit

gives much more information than:

val run_webserver :
  unit -[filesystem,network]-> unit

For points 1 to 3 I may be mistaken, but they seem possible with effect handlers too, i.e. fitting in a custom handler closer to the callsite in places where more control is needed.

Yes, that could work if you only want to use a single clock below that point. There are a couple of minor issues:

  1. If the fibre forks, the new fibre would get the wrong clock. You’d need to intercept the Fork effect and install your handler there too.
  2. How long it takes to handle an effect depends on how many effect handlers are on the fibre’s stack. Eio only installs one handler per fibre but if you add your own it will slow things down a bit.

Both of these problems are fixable by extending Eio with support for thread-local variables (“fibre-local” variables really, I suppose). I’ve been a little reluctant to do that because they can easily be misused, but there are valid reasons to want them. For example, Dream assigns each request a unique ID and it would be useful to attach that automatically to any log messages generated by fibres handling that request.

For clocks though, I personally prefer it to be explicit. BTW, note that a log reporter will be a global and so can be configured with a clock once at startup and will use that to timestamp all log messages. So you don’t need to pass a clock around everywhere just for logging. In general, we tend to make exceptions to the capability rules for (outputting) debugging, tracing, monitoring, etc, where user code just provides information that may be reported, but doesn’t read it or depend on it.

For point 4, I’m not clear how we would know to not cache. Are you referring to the heuristic that the function takes a clock, so we should notice that and perhaps avoid caching its result?

Yes, that’s all I meant.

One important difference is that Eio does I/O and domainslib doesn’t. For most uses, you should use Eio, but for certain CPU-bound tasks you may want to use domainslib as well as, or instead of, Eio.

The main design difference (which prevents simply merging them) is that domainslib will automatically migrate fibres between domains, whereas Eio ensures fibres stay in the domain that created them. Migrating fibres is often not what you want - see the Multicore Guide for some examples of what can go wrong - but it can lead to better performance in some cases.

To benefit from domainslib, you need: tasks that perform no IO (just calculations); that are either purely-functional or are written with an understanding of the OCaml memory model; and that are small enough that the overhead of sending them over an Eio Stream (which requires briefly taking a mutex) is unacceptable. We’re currently doing some benchmarking to establish exactly when it makes a difference and how much.
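
To show where that per-item cost comes from, here is a deliberately simple sketch of handing work to another domain through a mutex-protected queue. It assumes OCaml 5.0 or later and uses only the standard library; the `chan` type and its functions are hypothetical and are not how Eio's `Stream` or domainslib are actually implemented. Every `send` and `recv` takes the lock once, which is negligible for large jobs but can dominate when each job is as small as squaring an integer.

```ocaml
(* Requires OCaml 5.0 or later (Domain, plus Mutex and Condition in the stdlib). *)

(* A tiny unbounded channel; every send and receive takes the mutex. *)
type 'a chan = { q : 'a Queue.t; m : Mutex.t; nonempty : Condition.t }

let make_chan () =
  { q = Queue.create (); m = Mutex.create (); nonempty = Condition.create () }

let send c v =
  Mutex.lock c.m;
  Queue.push v c.q;
  Condition.signal c.nonempty;
  Mutex.unlock c.m

let recv c =
  Mutex.lock c.m;
  (* Wait until a producer has queued something. *)
  while Queue.is_empty c.q do Condition.wait c.nonempty c.m done;
  let v = Queue.pop c.q in
  Mutex.unlock c.m;
  v

(* Spawn one worker domain that sums the squares it is sent until it sees [None]. *)
let sum_squares_in_worker (inputs : int list) : int =
  let c = make_chan () in
  let worker =
    Domain.spawn (fun () ->
      let rec loop acc =
        match recv c with
        | None -> acc
        | Some n -> loop (acc + (n * n))
      in
      loop 0)
  in
  List.iter (fun n -> send c (Some n)) inputs;
  send c None;
  Domain.join worker

let () =
  Printf.printf "sum of squares: %d\n" (sum_squares_in_worker [ 1; 2; 3; 4 ])
```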

You should be able to combine Eio and domainslib (e.g. having an Eio domain serving web requests, but farming any CPU-intensive work out to a pool of domainslib workers), but for now you’ll have to write the connector yourself.

@kayceesrk may be able to say more about the benefits of domainslib.

Thanks for the explanation! Indeed I was wondering about potential duplication of efforts between the two libraries which were originally not scheduled to get released together.

This one I didn’t get – what are the pitfalls that require this when using tasks?

These ones: Problems with Multicore Programming

Excellent reasoning. Thank you.

Having used Eio for a little while now I would highly recommend giving it a go. For example, I converted a (work in progress) stun client to direct-style very quickly and went from four opam packages to one! I had to implement some features which are either upstreamed or in the process of being upstreamed (the client still uses my original fork of Eio).

Yes! There’s actually a much closer to upstream branch I’m working on that looks very like the luv implementation (coming soon…).

I’ve also been working on a multicore monorepo that uses the opam-monorepo tool to provide a quick way to get up and running with OCaml 5.00.0+trunk and some of the experiments and libraries in awesome-multicore-ocaml (e.g. the direct-style dream that uses lwt_eio). The idea being you should only need to:

git clone https://github.com/patricoferris/ocaml-multicore-monorepo
cd ocaml-multicore-monorepo
opam switch create 5.00.0+trunk
opam install dune.2.9.2
dune build

Trunk is a moving target so things might break, feel free to open issues if something does :)) Of course, if you just want the Eio libraries it is easier to install them on a 4.12+domains switch as mentioned.

Giving a choice actually requires doing it this way. With the current API, you can always store the capabilities in global variables and use them that way if you prefer. However, if Eio itself provides globals for “the clock”, “the file-system”, “the network”, etc, and libraries use them, then users don’t have a choice.

This doesn’t make sense to me. Can the environment from one call to Eio_main.run be used within another call to Eio_main.run? If not then how do you make a version using global variables? (Or perhaps only one simultaneous call to Eio_main.run is supported, in which case how is that enforced?)

If you can make a reliable version with global variables, then the capabilities aren’t giving you any guarantees and could easily be added as a higher-level layer on top of a library that didn’t use them.

It’s also a shame to see classes and object types appearing in what aims to be a foundational library. I would have thought those could also be added as a separate library for people that are happy to use those language features. Generally, I think there needs to be a less opinionated core library from which more opinionated libraries can be built, otherwise I suspect we will end up with multiple incompatible libraries for asynchronous IO again.

Eio_main.run is like Lwt_main.run. It is typically used once in an application, at the start, and everything runs inside it. You’d use it like this (which I would however consider bad style):

module Globals = struct
  let fs = ref None

  let init env =
    fs := Some (Eio.Stdenv.fs env)
  
  let load path =
    Eio.Dir.load (Option.get !fs) path

  let save path data =
    Eio.Dir.save (Option.get !fs) path data ~create:(`Exclusive 0o666)
end

let main () =
  let data = Globals.load "input.dat" in
  Globals.save "output.dat" data

let () =
  Eio_main.run @@ fun env ->
  Globals.init env;
  main ()

Whether the saved value works if you end the main-loop and then start another one would depend on the backend implementation. It’s certainly not guaranteed to work.

If you can make a reliable version with global variables, then the capabilities aren’t giving you any guarantees and could easily be added as a higher-level layer on top of a library that didn’t use them.

OCaml allows any module in my dependencies to run a static initialiser, which can use Obj.magic, call C functions, etc, so we can’t really guarantee anything about any non-trivial program. It would be nice if the compiler had some kind of safe mode where it would alert you about unsafe code, but it doesn’t yet.

So when Eio (or any OCaml library) talks about ensuring things, it means assuming you follow some reasonable rules, even if they’re not enforced. Roughly, this means that if you call an “unsafe” function, you need to verify things yourself. From a capability point of view, certain extra functions in the stdlib are considered unsafe, such as open_in, which gets access to the file-system from nothing.

Note that Eio_main.run itself is not a capability-safe function, because it gets access to the file-system and network from thin air. If you run the compiler in safe mode, it should therefore flag that call as something you need to audit manually. Which shouldn’t be surprising - obviously you need to read your program’s main entry-point to have any idea what it will do.

To get a bound on what it can do you need to follow the env argument and see where it goes. In the example above, it gets stored in Globals, so you need to audit everything that can access that. That would likely be a lot of work, which is why using global variables is generally considered bad style.

It’s also a shame to see classes and object types appearing in what aims to be a foundational library.

Why? Eio uses objects internally, while mostly exposing a functional API. Lwt does the same thing in places (e.g. Lwt_process), and all Lwt code uses Lwt_engine in the end. It doesn’t seem to cause any trouble.

FWIW I share a bit @lpw25’s sentiment here.

At a cursory look, eio’s design seems rather monolithic, cramming in as many features as possible so that lwt code can be replaced by direct-style code, along with a few more goodies on the way, which I’m not necessarily convinced about yet.

One of the promises of effects was their modularity/compositionality, so the way stuff is built here feels a bit backward; maybe trying to tie everything in directly with operating system interfaces is not such a good idea.

Taking the sole example of fibers, I would certainly like to see a core set of effects/primitives to deal with them in the Stdlib, but in a way that allows me to interpret them in possibly different ways than the one exposed here.

I had planned to experiment with Eio for a while so I’m particularly happy to see it released right when I was getting to my project! =)

I’m trying to use a tun device. I’ve started with the Tuntap library which provides a Unix.file_descr. I see no way to “convert” that file_descr to something for Eio. Is there anything for that? Should I instead avoid Tuntap and open the tun directly from the Eio world?