Eio library vs threads library for concurrent programming

Hi,

The motivation section of the Eio README (GitHub, ocaml-multicore/eio: Effects-based direct-style IO for multicore OCaml) says:

“The Unix library provided with OCaml uses blocking IO operations, and is not well suited to concurrent programs such as network services or interactive applications. For many years, the solution to this has been libraries such as Lwt and Async, which provide a monadic interface. These libraries allow writing code as if there were multiple threads of execution, each with their own stack, but the stacks are simulated using the heap.”

While I understand the advantages of Eio over Lwt/Async, what are the advantages of Eio compared to the Threads library? When using Threads, the Unix operations are not blocking anymore.

A big one is that Eio is based on “green” threads (scheduled in user space rather than by the OS), which are much lighter and more efficient than the OS-level threads exposed by the Thread module.

Cheers,
Nicolas

Another quick question regarding Eio. The README mentions the use of “effects”, but looking at the type signatures of many of the Eio functions I don’t see any effect types used.
Where are those effects?

Final one: what happens if you call a Unix function (e.g., to read a file) inside one of the fibers? I know there is Eio.File.xxx for reading a file, but what if you move code to Eio gradually and some calls to Unix. functions remain? Is the whole thing blocked, or can another fiber take over?

My understanding is that Eio (just like Lwt and Async) is useful if you expect heavy concurrency and a heavy focus on IO. For example, a web application with many active clients (websockets or other long-running connections) is a good use case.

Threads are perfectly fine if you have significant computation in addition to some IO. On a Linux machine they will easily scale to thousands or more connections per second. Thread pools can increase scalability further. With OCaml 5, threads can also be used for parallelism if they are spread across multiple domains.

Effect handlers are used to implement Eio’s lightweight threads, but are not exposed in the API.

A blocking Unix call will block the whole domain, and no other fiber will take over. There is no preemption: lightweight threads implemented using effect handlers are scheduled cooperatively, so a fiber only gives up control inside Eio functions.
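To make both answers concrete, here is a toy cooperative scheduler built on OCaml 5 effect handlers. It is only a sketch of the general technique, not Eio's implementation: the `Yield` effect and the `yield`, `run`, `log` and `record` names are made up for illustration. Note that `yield` has the plain type `unit -> unit`, so the effect never shows up in a signature.

```ocaml
(* Requires OCaml 5 (Effect module). Toy scheduler, NOT Eio's code. *)
open Effect
open Effect.Deep

(* Hypothetical effect: a fiber performs it to let other fibers run. *)
type _ Effect.t += Yield : unit Effect.t

(* The effect is invisible in the type: this is just unit -> unit. *)
let yield () : unit = perform Yield

(* Records the order in which fibers execute. *)
let log : string list ref = ref []
let record s = log := s :: !log

(* Run every function in [fibers] as a fiber until all have finished. *)
let run (fibers : (unit -> unit) list) : unit =
  let ready : (unit -> unit) Queue.t = Queue.create () in
  (* Resume the next runnable fiber, if there is one. *)
  let next () =
    match Queue.take_opt ready with
    | Some resume -> resume ()
    | None -> ()
  in
  let spawn f =
    match_with f ()
      { retc = (fun () -> next ());          (* fiber finished *)
        exnc = raise;
        effc = (fun (type a) (eff : a Effect.t) ->
          match eff with
          | Yield ->
            Some (fun (k : (a, unit) continuation) ->
              (* Park this fiber at the back of the queue. *)
              Queue.add (fun () -> continue k ()) ready;
              next ())
          | _ -> None) }
  in
  List.iter (fun f -> Queue.add (fun () -> spawn f) ready) fibers;
  next ()

let () =
  run [ (fun () -> record "a1"; yield (); record "a2");
        (fun () -> record "b1"; yield (); record "b2") ];
  (* Prints a1, b1, a2, b2: fibers switch only at yield points. *)
  List.iter print_endline (List.rev !log)
```

A fiber that never calls `yield` (for instance because it is stuck in a blocking `Unix` call) keeps control until it returns, which is why the rest of the domain stalls.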

The Unix operations are normally blocking when using threads from the Thread library: that is the whole point. But when one thread is blocking on i/o, the OS’s thread scheduler will let some other Thread module thread proceed instead, if there is one. (As a side issue, within any one domain only one Thread module thread can take over the OCaml runtime at a time, so Thread module threads cannot of themselves provide parallelism for work which is cpu bound rather than i/o bound.)
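As a small illustration of that behaviour, here is a sketch using only the `unix` and `threads.posix` libraries shipped with OCaml. The `demo` function and the pipe set-up are made up for the example: one thread blocks in `Unix.read` while the main thread carries on and eventually unblocks it.

```ocaml
(* Build with the unix and threads.posix libraries. Illustrative only. *)
let demo () : int * string =
  let r, w = Unix.pipe () in
  let got = ref "" in
  (* Reader thread: Unix.read blocks this thread until data arrives. *)
  let reader =
    Thread.create
      (fun () ->
         let buf = Bytes.create 16 in
         let n = Unix.read r buf 0 16 in
         got := Bytes.sub_string buf 0 n)
      ()
  in
  (* The main thread is not held up by the reader's blocking call. *)
  let steps = ref 0 in
  for _ = 1 to 1000 do incr steps done;
  (* Writing to the pipe lets the reader's Unix.read return. *)
  ignore (Unix.write_substring w "hello" 0 5);
  Thread.join reader;
  Unix.close r;
  Unix.close w;
  (!steps, !got)

let () =
  let steps, got = demo () in
  Printf.printf "main did %d steps, reader got %S\n" steps got
```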

Conversely, when using OCaml effects, or when using Lwt or Async, the Unix operations do not block on i/o. Instead, when EAGAIN or EWOULDBLOCK is encountered, control is returned to the program loop to allow operations on some other file descriptor which is ready to proceed. The concurrency is co-operative.

Edit: The latter means that when conducting i/o operations using effects, or when using Lwt or Async, you should use the i/o functions provided by the library in question, which will perform non-blocking i/o, rather than those in the OCaml Unix module, which generally will not.
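The EAGAIN/EWOULDBLOCK mechanism can be sketched with the plain Unix module on a POSIX system. This is not how any of these libraries is actually written; `try_read` and `demo` are hypothetical helpers, and a real event loop would multiplex many descriptors rather than one pipe.

```ocaml
(* Build with the unix library; assumes a POSIX system. Illustrative only. *)

(* Try to read without blocking. None means "not ready yet": an event
   loop would go and service some other descriptor instead of waiting. *)
let try_read (fd : Unix.file_descr) (buf : Bytes.t) : int option =
  match Unix.read fd buf 0 (Bytes.length buf) with
  | n -> Some n
  | exception Unix.Unix_error ((Unix.EAGAIN | Unix.EWOULDBLOCK), _, _) -> None

let demo () : int option * int option =
  let r, w = Unix.pipe () in
  Unix.set_nonblock r;
  let buf = Bytes.create 16 in
  (* Nothing has been written yet, so this returns None immediately. *)
  let first = try_read r buf in
  ignore (Unix.write_substring w "ping" 0 4);
  (* Wait until the descriptor is reported readable, then retry. *)
  let _ = Unix.select [ r ] [] [] 1.0 in
  let second = try_read r buf in
  Unix.close r;
  Unix.close w;
  (first, second)

let () =
  match demo () with
  | None, Some n -> Printf.printf "first read not ready, second read %d bytes\n" n
  | _ -> print_endline "unexpected result"
```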

Yes sorry, what I meant is that Unix operations are indeed blocking when using threads, but they don’t block the whole program: other threads can be resumed. With Eio, if a fiber calls an operation from the Unix module, it blocks not only that fiber but also the whole event loop, and so the whole program (or, when using multiple domains, the current domain).

When you have lots of legacy OCaml code already written against the standard API (Unix, Sys) that does lots of computation and IO, it seems better to use Threads than Eio. Eio would require changing all those Unix calls to Eio calls, otherwise the whole program (or domain) is blocked, whereas with threads another thread can take over.

The eio.unix library does contain an Eio_unix.run_in_systhread function, which lets you call these blocking Unix functions without blocking the Eio loop. In fact, some of the Eio backends use this and simply wrap Unix functions (e.g. getaddrinfo). It may not be the most efficient approach, and doing your own thing with Threads directly might be better.
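A minimal sketch of what that looks like, assuming OCaml 5 and the eio_main package. `legacy_blocking_call` is a hypothetical stand-in for existing code that uses the Unix module, and the timings are arbitrary.

```ocaml
(* Assumes OCaml 5 with the eio_main package (plus unix). Sketch only. *)

(* Hypothetical stand-in for legacy code that calls blocking Unix functions. *)
let legacy_blocking_call () : string =
  Unix.sleepf 0.2;
  "done"

let demo () : string * int =
  Eio_main.run @@ fun env ->
  let clock = Eio.Stdenv.clock env in
  let result = ref "" in
  let ticks = ref 0 in
  Eio.Fiber.both
    (fun () ->
       (* The blocking call runs in a system thread; only this fiber waits. *)
       result := Eio_unix.run_in_systhread legacy_blocking_call)
    (fun () ->
       (* Meanwhile this fiber keeps being scheduled by the Eio loop. *)
       for _ = 1 to 3 do
         Eio.Time.sleep clock 0.01;
         incr ticks
       done);
  (!result, !ticks)

let () =
  let result, ticks = demo () in
  Printf.printf "legacy call returned %S, other fiber ticked %d times\n" result ticks
```

Calling `legacy_blocking_call ()` directly in the first fiber would still produce the same final values, but the second fiber could not run until the sleep had finished.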
