Eio 0.1 - effects-based direct-style IO for OCaml 5

In OCaml, “installing” a handler is what creates a new fibre in the first place. When you call Fibre.fork:

  1. It performs the Fork effect, suspending the current fibre.
  2. The effect gets handled in a backend’s scheduler, e.g. in eio/lib_eio_luv/eio_luv.ml at commit 9e1df71bd6d95e962b207f3b8c849215ac01d48a of ocaml-multicore/eio
    (note that new_fibre here is some metadata for the new fibre, not the fibre itself)
  3. The recursive call to fork then uses match_with to create a new fibre, with the same IO effect handlers.
  • How long it takes to handle an effect depends on how many effect handlers are on the fibre’s stack. Eio only installs one handler per fibre but if you add your own it will slow things down a bit.
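The three steps above can be sketched with a toy scheduler. This is not Eio's code: it assumes the OCaml 5 standard library `Effect` module, and the `Fork` effect, the run queue, and the choice to run the child before resuming the parent are all illustrative assumptions.

```ocaml
(* Requires OCaml 5 (stdlib Effect module). *)
open Effect
open Effect.Deep

(* Hypothetical effect, standing in for a scheduler's internal Fork. *)
type _ Effect.t += Fork : (unit -> unit) -> unit Effect.t

let log = ref []
let note s = log := s :: !log

(* Run [main] under a toy scheduler with a FIFO run queue. *)
let run main =
  let run_q : (unit -> unit) Queue.t = Queue.create () in
  let dequeue () =
    match Queue.take_opt run_q with
    | Some resume -> resume ()
    | None -> ()
  in
  (* Each call to [fork] installs a handler; that is what creates the fibre. *)
  let rec fork f =
    match_with f ()
      { retc = (fun () -> dequeue ());
        exnc = raise;
        effc = (fun (type a) (eff : a Effect.t) ->
          match eff with
          | Fork child ->
            Some (fun (k : (a, unit) continuation) ->
              (* The parent is suspended here; queue it to be resumed later. *)
              Queue.add (fun () -> continue k ()) run_q;
              (* The recursive call gives the child the same handler. *)
              fork child)
          | _ -> None) }
  in
  fork main

let () =
  run (fun () ->
    note "parent: start";
    perform (Fork (fun () -> note "child: run"));
    note "parent: resumed")
```

Each fibre ends up with exactly one handler on its stack, so the cost of performing `Fork` does not grow with the number of fibres.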

Isn’t this a rather important limitation of effect handlers in OCaml? Handlers are supposed to make it easy to control effectful code in a fine-grained way. If their performance in real-world scenarios is not scalable, does this mean there is more work to be done here? Or is this specifically about Eio’s fibres?

This is about OCaml’s fibre implementation. An effect handler works a bit like an exception handler, with the effect being passed to the nearest enclosing handler, which may then re-perform it if it doesn’t want to handle it itself.
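A minimal sketch of that lookup, assuming the OCaml 5 `Effect` module (the `Ask_int` and `Ask_name` effects are made up for illustration). Returning `None` from `effc` passes the effect on to the next enclosing handler, so `Ask_int` has to travel through one extra handler before it is answered.

```ocaml
open Effect
open Effect.Deep

(* Hypothetical effects for illustration. *)
type _ Effect.t += Ask_int : int Effect.t
type _ Effect.t += Ask_name : string Effect.t

(* Handles only Ask_name; anything else goes to the next enclosing handler. *)
let with_name name f =
  try_with f ()
    { effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        | Ask_name -> Some (fun (k : (a, _) continuation) -> continue k name)
        | _ -> None) }

(* Handles only Ask_int. *)
let with_int n f =
  try_with f ()
    { effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        | Ask_int -> Some (fun (k : (a, _) continuation) -> continue k n)
        | _ -> None) }

let result =
  with_int 42 (fun () ->
    with_name "eio" (fun () ->
      (* Ask_int is not handled by the inner handler, only by the outer one. *)
      Printf.sprintf "%s:%d" (perform Ask_name) (perform Ask_int)))
```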

Eio is very conservative in its use of effects, doing the minimum needed to provide concurrency. There is one handler per fibre, so it is as fast as possible. Handlers are pretty quick, so a bit of nesting may be fine, but Eio doesn’t force that cost on you.

To be clear: this scales fine to any number of concurrent fibres. The slow-down only happens if you want to nest handlers.

The only good use I’ve found myself for nested effects is in Angstrom, where a handler is used to provide the legacy callback API on top of the new effect-based core. That function shouldn’t be needed in new code though.

As Leo White touched upon here, there is an overlap between capabilities and typed effects. I’m wondering how big this overlap will be in practice. If it is large, I don’t think it’s very elegant to have several ways to express the same thing, and it becomes outright confusing for users: is my IO capability an effect or an argument?

To my understanding, the standard library will be rewritten to use typed effects, so when one opens a file within a function (at least when not using Eio), that effect will become part of the function’s interface.

Daniel talks about how users shouldn’t be using effects directly - but is the compile-time specification of allowed effects not exactly one of the most exciting qualities in the context of OCaml? This is at least what I’m most excited about - that I’ll be able to restrict a big part of my codebase to be pure, and that it becomes obvious when this is not the case.

I love the prospect of making effectful dependencies explicit - but I don’t want two ways to do the same.

Indeed, what I meant by that is that one should strive to make one’s abstractions polymorphic over effects, i.e. let the client choose the one it wants in order to satisfy a need (e.g. give me more bytes to process).

A good way of doing this is to let the client specify these needs via functions (which could also turn out to be pure!).

Once given to your abstraction, the concrete effects chosen by the client in these functions will of course propagate to uses of your own abstraction when it uses them.

How could these functions be pure? I’m imagining them throwing an effect, and then some interpreter handling them - or do I misinterpret?

let more_bytes () = None

Most of the time yes.
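To make the point concrete, here is a small sketch, assuming the OCaml 5 `Effect` module. The consumer `read_all` and the `More` effect are hypothetical names: the consumer only sees a function, and it is the client that decides whether that function is pure or performs an effect.

```ocaml
open Effect
open Effect.Deep

(* A consumer that is agnostic about where its input comes from. *)
let read_all (more_bytes : unit -> string option) : string =
  let buf = Buffer.create 16 in
  let rec loop () =
    match more_bytes () with
    | Some chunk -> Buffer.add_string buf chunk; loop ()
    | None -> Buffer.contents buf
  in
  loop ()

(* A pure client: there is never any more input. *)
let pure_result = read_all (fun () -> None)

(* An effectful client: the same consumer, but each request performs an
   effect (hypothetical name) that an enclosing handler answers. *)
type _ Effect.t += More : string option Effect.t

let effect_result =
  let chunks = ref ["ab"; "cd"] in
  try_with (fun () -> read_all (fun () -> perform More)) ()
    { effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        | More ->
          Some (fun (k : (a, _) continuation) ->
            match !chunks with
            | [] -> continue k None
            | c :: rest -> chunks := rest; continue k (Some c))
        | _ -> None) }
```

Here `read_all` itself never mentions an effect; whatever the client's function does simply propagates through it.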

Yes, I think this is a very interesting question. Is there a spec for typed-effects yet? The most recent information I have is from Leo’s talk, but that was ages ago.

One unexpected (to me) result of writing Eio was that despite being an effects-based IO library, the Eio API doesn’t actually define any IO effects! The four effects we ended up using were Suspend (waiting for promises, etc), Fork (for Fibre.fork), Trace (for debugging) and Get_context (fibre-local context, used for cancellation). Anything that implements these four should work as an Eio backend.

Everything about IO is entirely contained in the various backends. For example, Eio_linux internally defines an ERead effect, and provides (via env) a file-system abstraction that performs this effect. But Eio_luv instead uses a generic Enter effect for this. This is nice and modular, but not how I originally expected it to work!
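As a rough illustration of that layering (not Eio's actual types): in the sketch below, `Toy_backend`, `source` and `Read_chunk` are invented names, and the OCaml 5 `Effect` module is assumed. The IO effect is private to the backend, and application code only sees the abstraction handed to it by `run`.

```ocaml
open Effect
open Effect.Deep

(* Generic interface that application code sees (hypothetical). *)
type source = < read : unit -> string option >

module Toy_backend : sig
  val run : (source -> 'a) -> 'a
end = struct
  (* The IO effect is private to this backend. *)
  type _ Effect.t += Read_chunk : string option Effect.t

  let source : source = object
    method read () = perform Read_chunk
  end

  let run main =
    (* Canned input stands in for real IO. *)
    let pending = ref ["GET "; "/index"] in
    try_with main source
      { effc = (fun (type a) (eff : a Effect.t) ->
          match eff with
          | Read_chunk ->
            Some (fun (k : (a, _) continuation) ->
              match !pending with
              | [] -> continue k None
              | c :: rest -> pending := rest; continue k (Some c))
          | _ -> None) }
end

let request =
  Toy_backend.run (fun src ->
    let rec loop acc =
      match src#read () with
      | Some chunk -> loop (acc ^ chunk)
      | None -> acc
    in
    loop "")
```

A second backend could implement `source` with a completely different private effect without the application noticing.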

In the lazy initialization pattern, do you really need a third-party library to be parametric in the async IO engine you use? Is basic systhread IO not sufficient for this generic purpose?

Regarding thread-safe Lazys, if they are important for this thread, I would like to mention the design I posted here: https://github.com/ocaml-multicore/ocaml-multicore/issues/750#issuecomment-978125441. It could still use feedback from programmers, and volunteers to help (let me know).

Apologies in advance, I don’t have much of value to contribute, but I would love it if someone could explain some of the basics or intermittently summarize the discussion for folks like me (or even point to reading material).

In my limited understanding:

  • effects are being implemented as one-shot delimited continuations
  • delimited continuations, like exceptions, should allow for non-local handling, i.e. from the arguments of a function, you have no clue what effects it may perform OR how they will eventually be handled
  • eventually the type-system will be used to track what effects a function is performing
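The "one-shot" part of the first point can be observed directly. A minimal sketch, assuming the OCaml 5 `Effect` module (the `Choose` effect is hypothetical): resuming the same continuation a second time is rejected at runtime with `Continuation_already_resumed`.

```ocaml
open Effect
open Effect.Deep

type _ Effect.t += Choose : bool Effect.t  (* hypothetical effect *)

let outcome =
  try_with
    (fun () -> if perform Choose then "first" else "second")
    ()
    { effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        | Choose ->
          Some (fun (k : (a, _) continuation) ->
            (* The first resumption runs the computation to completion. *)
            let first = continue k true in
            (* A second resumption of the same continuation is not allowed. *)
            match continue k false with
            | second -> first ^ "+" ^ second
            | exception Continuation_already_resumed -> first ^ " only")
        | _ -> None) }
```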

So, the questions in my mind are:

  1. What advantage is there to passing capabilities explicitly? They seem to have some overlap with effects - but shouldn’t they be determined by whatever handler is installed, i.e. non-locally, which is the intended use of effects? Is this just a temporary stop-gap / workaround for not having typed effects and forcing programmers to track effects/capabilities?
  2. Re:

Wouldn’t any mocking just be about whatever handler you install outside - so not passing capabilities explicitly would actually be perfect for mocking? i.e. install a mock-network handler, mock-filesystem-handler on the outside and test the code?
  3. Do effects need to be tracked in the same type system? i.e. do you need to unify the effects with the types of values in the rest of the language? Because honestly, this seems like too much cognitive overhead:

val run_webserver :
  unit -[www_root: Eio.Dir.t;
         certificates: Eio.Dir.t;
         network: Eio.Net.listening_socket] -> unit

Maybe it’s only a matter of how the type is printed/shown, but also more theoretically, can it not be a separate unification i.e. one for types and one for effects (effects will always be unit in type)? Maybe resulting in two separate signatures for the function like this:

val run_webserver : unit -> unit
eff run_webserver : www_root -> certificates -> network -> tty_io

So it lets you see AND track separately the type and the type of effects/capabilities the function has. You could even have compiler flags switching the “effect system” on and off, to choose between pure/impure language?

For the use case of limiting a function’s side effects using typed effects, I agree that it would be an ugly interface to include a lot of effects in the function’s type. I imagined a different solution, though.

For users that don’t care about what effects their code has, they just use the ML-effect “->” in their mli, or any other less general named set of effects. I guess this would be bad style - especially for libraries.

In the case where you want to be exact about what effects are possible, I was imagining a semantics like that of polymorphic variants, with structural subtyping. So one would name a set of effects, and include this set in other sets. The downside, though, is that you need to go to the definitions of several sets of effects to see the full set, or let Merlin list the set.
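For readers less familiar with polymorphic variants, this is what named, includable sets look like in today's OCaml. It is only an analogy for the imagined effect sets: the tags below are ordinary values, not typed effects, and all names are made up.

```ocaml
(* Named sets of "effects", modelled here with polymorphic variants. *)
type fs = [ `Read_file | `Write_file ]
type net = [ `Connect ]

(* A larger set that includes the two named sets above. *)
type io = [ fs | net | `Log ]

let describe_fs : fs -> string = function
  | `Read_file -> "read a file"
  | `Write_file -> "write a file"

let describe_io : io -> string = function
  | #fs as e -> describe_fs e
  | `Connect -> "open a connection"
  | `Log -> "write a log line"

(* Structural subtyping: any [fs] value can be widened to an [io] value. *)
let widened = describe_io (`Read_file : fs :> io)
```

Reading `io` in isolation does not show its full contents, which is the downside mentioned above: one has to look up `fs` and `net` as well.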


EDIT: Probably it won’t be bad style to mark all your functions with a common set of effects in a library, as long as all the effects are handled by the library. This would avoid breaking user code when the set of internally used effects got updated - but I guess this also depends on a feature to make a named set of effects abstract in a module?

Wouldn’t any mocking just be about whatever handler you install outside - so not passing capabilities explicitly would actually be perfect for mocking? i.e. install a mock-network handler, mock-filesystem-handler on the outside and test the code?

My understanding is that there isn’t an effect defined for each system call (because you wouldn’t want to pay that price? and different backends would not provide the same syscalls?). Rather Eio uses effects only to simulate direct style and hide the callback hell that is async io… but then, this implies that the typed effects are going to be uninformative about what side effects Eio is performing.

Besides, even if we could intercept the syscalls with an effect handler, how would we mock opaque types like file_descr or out_channel in response to an Open effect? The mock effect handler will also need to check that effects like Write (file_descr, ..) are actually targeting the right thing, so you have to reimplement a dispatch that comes for free with a mock object.
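The two mocking styles under discussion can be put side by side. This is a sketch under assumptions: `Read_file`, `dir` and the helper names are hypothetical, not part of Eio or the standard library, and the OCaml 5 `Effect` module is assumed.

```ocaml
open Effect
open Effect.Deep

(* Style 1: a hypothetical effect, answered by whatever handler is outside. *)
type _ Effect.t += Read_file : string -> string Effect.t

let greeting_via_effect () = "hello, " ^ perform (Read_file "name.txt")

let with_mock_fs files f =
  try_with f ()
    { effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        | Read_file path ->
          Some (fun (k : (a, _) continuation) ->
            (* The handler has to do its own dispatch on the target. *)
            match List.assoc_opt path files with
            | Some data -> continue k data
            | None -> discontinue k Not_found)
        | _ -> None) }

(* Style 2: the capability is an ordinary argument. *)
type dir = < read : string -> string >

let greeting_via_cap (dir : dir) = "hello, " ^ dir#read "name.txt"

let mock_dir files : dir =
  object
    method read path = List.assoc path files
  end

let files = ["name.txt", "world"]
let via_effect = with_mock_fs files greeting_via_effect
let via_cap = greeting_via_cap (mock_dir files)
```

Both produce the same result here. The difference is where the dispatch lives: in the handler for the first style, in the mock object for the second.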

I finally did my own experiments with effect handlers, only to discover that I had naïvely overlooked their semantics. It helped me realize that you can’t set up local effect handlers and also expect to use fibres / green threads inside them.
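The interaction can be reproduced with a toy scheduler. This is a sketch, not Eio: the `Fork` and `Get` effects and the scheduler are assumptions made for illustration, using the OCaml 5 `Effect` module. The forked fibre is started by the scheduler's handler, outside the local handler, so it does not see the local handler.

```ocaml
open Effect
open Effect.Deep

type _ Effect.t += Fork : (unit -> unit) -> unit Effect.t  (* hypothetical *)
type _ Effect.t += Get : int Effect.t                      (* hypothetical *)

let log = ref []
let note s = log := s :: !log

(* Toy scheduler: each fibre gets only the scheduler's own handler. *)
let run main =
  let run_q : (unit -> unit) Queue.t = Queue.create () in
  let dequeue () =
    match Queue.take_opt run_q with
    | Some resume -> resume ()
    | None -> ()
  in
  let rec fork f =
    match_with f ()
      { retc = dequeue;
        exnc = raise;
        effc = (fun (type a) (eff : a Effect.t) ->
          match eff with
          | Fork child ->
            Some (fun (k : (a, unit) continuation) ->
              Queue.add (fun () -> continue k ()) run_q;
              fork child)
          | _ -> None) }
  in
  fork main

(* A local handler that answers Get for the code it wraps. *)
let with_value v f =
  try_with f ()
    { effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        | Get -> Some (fun (k : (a, _) continuation) -> continue k v)
        | _ -> None) }

let report who =
  match perform Get with
  | v -> note (Printf.sprintf "%s: got %d" who v)
  | exception Unhandled _ -> note (who ^ ": Get unhandled")

let () =
  run (fun () ->
    with_value 1 (fun () ->
      report "parent";
      (* The child is started by the scheduler, outside [with_value]. *)
      perform (Fork (fun () -> report "child"))))
```

In this sketch the parent's `Get` is answered by the local handler, while the child's `Get` finds no handler at all.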

Eio’s “fork” seems to be a misnomer; it’s more like spawning an on-the-side helper thread to run at scheduler scope. A true fork, in which both paths of execution inherit the handlers at fork point, looks to me like the amb/nondeterminism effect, which requires multishot continuations.

Fun read, thanks for exploring

@art-w this is an excellent blog post about effect handlers, could you maybe consider posting it into its own thread?

@art-w related to your article, it’s worth pondering over the fact that effects are isomorphic to monads. Anything monads can do, effects can do too. The problem is that monads are fully type-safe, and until effects gain types in OCaml, they’re unsafe and therefore their proliferation can turn OCaml into an unsafe language. That’s why IMO we should really only use them for concurrency for now, and even then we should be careful.

It would be nice to see a small test repository that demonstrates the tradeoffs associated with alternative APIs. This would make arguing the pros and cons more fact-based and less contentious. In my experience it is generally possible to avoid the object system by using functors and first-class modules. I find modules easier to reason about and that typically compensates for any minor syntactic overhead. Extending or overriding functionality in libraries does not strike me as a frequent enough occurrence to worry about verbosity.

When it comes to reflecting capability-based security in the type system, have you considered using polymorphic variants together with phantom types? This would allow you to capture required capabilities in the type of a function, e.g. whether writes to the filesystem may occur.
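A minimal sketch of that idea, with an in-memory table standing in for a real directory. The `Dir` module and its functions are hypothetical and not an Eio API; the phantom parameter `'perm` records which operations a handle allows.

```ocaml
module Dir : sig
  type 'perm t
  val create : unit -> [ `Read | `Write ] t
  (* Drop the write permission from a handle. *)
  val read_only : [> `Read ] t -> [ `Read ] t
  val read : [> `Read ] t -> string -> string option
  val write : [> `Write ] t -> string -> string -> unit
end = struct
  (* The 'perm parameter is phantom: it never appears in the representation. *)
  type 'perm t = (string, string) Hashtbl.t
  let create () = Hashtbl.create 8
  let read_only t = t
  let read t name = Hashtbl.find_opt t name
  let write t name data = Hashtbl.replace t name data
end

let rw = Dir.create ()
let () = Dir.write rw "motd" "hello"

(* Code that is handed [ro] can read but cannot write. *)
let ro = Dir.read_only rw
let motd = Dir.read ro "motd"

(* [Dir.write ro "motd" "oops"] would be rejected by the type checker,
   because [`Read] does not include [`Write]. *)
```

A function's signature then states whether writes may occur: anything taking a `` [> `Write ] Dir.t `` may write, anything taking only `` [> `Read ] Dir.t `` may not.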

It may be a good idea to think about who the intended audience for Eio is. I’ll venture to say that it will never be something that beginners should have to learn before being able to write simple OCaml code. I see it as a tool for experienced and disciplined programmers who care about both safety and performance in large, complex applications. Having to manage capabilities or understand advanced type system features doesn’t strike me as an overly large burden for this audience. Many of them are already using monadic implementations, which may be even more difficult or cumbersome. I’d personally be completely fine with, for example, using functors to instantiate a whole module within a certain capability environment to avoid having to pass capabilities along through functions.
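The functor variant could look roughly like this. It is a sketch with made-up names (`ENV`, `Server`, `Test_env`): the capabilities are fixed once when the module is instantiated, so the functions inside no longer take them as arguments.

```ocaml
(* Hypothetical signature describing the capabilities a component needs. *)
module type ENV = sig
  val read_file : string -> string
  val log : string -> unit
end

(* The whole module is instantiated within one capability environment. *)
module Server (E : ENV) = struct
  let motd () =
    E.log "serving motd";
    String.uppercase_ascii (E.read_file "motd")
end

(* A test environment backed by in-memory data. *)
let log_lines = ref []

module Test_env : ENV = struct
  let read_file = function
    | "motd" -> "hello"
    | path -> failwith ("no such file: " ^ path)
  let log line = log_lines := line :: !log_lines
end

module Test_server = Server (Test_env)
let motd = Test_server.motd ()
```

The trade-off is granularity: every function in `Server` has access to everything in `E`, whereas passing capabilities as arguments can restrict them per function.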
