Should the OCaml Stdlib ship an IO interface?

With concurrency so important in the OCaml ecosystem, and two different concurrency systems in wide use in the community, I notice that a lot of libraries define an Io (or similar) interface and type for functorization. My question is: would it make sense to introduce a standardized Io interface in the OCaml standard library so that these different libraries and applications can automatically be compatible with each other even at the functorized level? If not, why not?

Prior art: Rust ships a Future trait for exactly this purpose.


It is not exactly the same but I think you may find this interesting: https://github.com/ocaml/ocaml/pull/8937

There are various interfaces for synchronous I/O. I gather that you are focusing on asynchronous I/O in your post, or in fact on the more general Pandora's box of “cooperative concurrency that interacts nicely with blocking calls”. I guess a first answer to the question could be that, given the diversity of approaches, it is hard to think of one asynchronous interface we would want to put in the standard library.

Maybe there is still time to let the third-party library ecosystem accrue knowledge on the right ways to do it (Lwt/Async sound stable enough now, with for example ocaml-libuv for Lwt-backed libuv-provided async IO), but there are also new contenders in active development (for example luv for the IO-specific aspects and, the elephant in the room, Multicore-OCaml and concurrency-via-effect-handlers approaches).

Or maybe we want to start getting minimal/interoperable pieces of the several sensible approaches to this problem that would be useful to users and ecosystem coordination. That sounds like a lot of work, though.


What is usually abstracted about IO is the monad, and the common definition is:

type 'a t

val bind : 'a t -> ('a -> 'b t) -> 'b t
val return : 'a -> 'a t
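As a rough sketch of what functorizing over this signature looks like in practice (the names `Make_sum` and `Blocking` are made up for illustration and are not taken from any of the libraries mentioned here):

```ocaml
(* The signature from above, given a name so a functor can take it. *)
module type IO = sig
  type 'a t

  val bind : 'a t -> ('a -> 'b t) -> 'b t
  val return : 'a -> 'a t
end

(* Hypothetical library code: sums the results of a list of IO actions
   without knowing which concurrency library will run them. *)
module Make_sum (Io : IO) = struct
  let ( >>= ) = Io.bind

  let rec sum = function
    | [] -> Io.return 0
    | action :: rest ->
      action >>= fun x ->
      sum rest >>= fun total ->
      Io.return (x + total)
end

(* A trivial blocking instantiation: an "action" is already its result. *)
module Blocking = struct
  type 'a t = 'a

  let bind x f = f x
  let return x = x
end

module Sum = Make_sum (Blocking)

let () = Printf.printf "%d\n" (Sum.sum [ 1; 2; 3 ])
```

With Lwt, one would presumably pass a module where `type 'a t = 'a Lwt.t`, `bind` is `Lwt.bind` and `return` is `Lwt.return`; the library code itself would not change.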

Then some libraries (e.g. ocaml-cohttp or ocaml-git) perform their syscalls through this monad. However, systematically functorizing your code over this monad is not always a good idea. Some other libraries prefer to define their own type (I'm not sure “monad” is the right term in this case) whose operations return specific actions. They then let the user decide which syscall to perform for each of these actions. These libraries are completely free of the monad, yet they can still benefit from features specific to Async or Lwt. Examples are ocaml-tls, decompress and angstrom:

type t

val handle : t -> [ `Read of bytes | `Write of string | `Done ]
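To make the idea concrete, here is a minimal sketch of such an action-returning state machine together with a blocking driver. Everything in it is an assumption made for illustration: the `Upcase` module, its `filled` function and the driver `run_on_string` are hypothetical and do not reflect the actual APIs of ocaml-tls, decompress or angstrom.

```ocaml
(* A toy "codec" that upper-cases its input. It never performs I/O itself:
   it only tells the caller what to do next. *)
module Upcase : sig
  type t

  val create : unit -> t
  val handle : t -> [ `Read of bytes | `Write of string | `Done ]

  (* After a `Read, the caller reports how many bytes it put in the
     buffer; 0 means end of input. *)
  val filled : t -> int -> unit
end = struct
  type state = Want_input | Has_output of string | Finished
  type t = { buf : bytes; mutable state : state }

  let create () = { buf = Bytes.create 4; state = Want_input }

  let handle t =
    match t.state with
    | Want_input -> `Read t.buf
    | Has_output s ->
      t.state <- Want_input;
      `Write s
    | Finished -> `Done

  let filled t n =
    if n = 0 then t.state <- Finished
    else
      t.state <-
        Has_output (String.uppercase_ascii (Bytes.sub_string t.buf 0 n))
end

(* One possible driver: plain blocking code over an in-memory string.
   An Lwt or Async driver would have the same shape, with real reads and
   writes in the `Read and `Write branches. *)
let run_on_string input =
  let t = Upcase.create () in
  let out = Buffer.create 16 in
  let pos = ref 0 in
  let rec loop () =
    match Upcase.handle t with
    | `Read buf ->
      let n = min (Bytes.length buf) (String.length input - !pos) in
      Bytes.blit_string input !pos buf 0 n;
      pos := !pos + n;
      Upcase.filled t n;
      loop ()
    | `Write s ->
      Buffer.add_string out s;
      loop ()
    | `Done -> Buffer.contents out
  in
  loop ()
```

The point of the design is that only the driver knows about the scheduler, so the same `Upcase` code can be reused unchanged under any concurrency library.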

Some others, though, want to call a syscall at a specific point in the process, but they want to stay independent of the IO implementation and they don't want to use a functor. This is another pattern, which is starting to be used by MirageOS in some specific contexts:

type ('a, 's) io

type 's scheduler =
  { bind : 'a 'b. ('a, 's) io -> ('a -> ('b, 's) io) -> ('b, 's) io
  ; return : 'a. 'a -> ('a, 's) io }

type ('resolver, 's) gethostbyname = 'resolver -> domain -> (ipv4, 's) io

val connect : 's scheduler -> ('r, 's) gethostbyname -> 'r -> socket -> (socket, 's) io
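A minimal sketch of how a scheduler value of this shape could be built and used, assuming an identity-cast encoding for the `'s` tag. The `Make` functor, the `Direct` instance and the `add_twice` function are hypothetical names introduced for the example; this is one possible way to inhabit the type, not necessarily how any given library does it.

```ocaml
type ('a, 's) io

type 's scheduler =
  { bind : 'a 'b. ('a, 's) io -> ('a -> ('b, 's) io) -> ('b, 's) io
  ; return : 'a. 'a -> ('a, 's) io }

(* Assumption: each concurrency library gets its own tag [s], and values
   are converted to and from [('a, s) io] with identity casts. *)
module Make (T : sig
  type 'a t
end) =
struct
  type s

  external inj : 'a T.t -> ('a, s) io = "%identity"
  external prj : ('a, s) io -> 'a T.t = "%identity"
end

(* A direct-style (blocking) instance: an action is already its result. *)
module Id = struct
  type 'a t = 'a
end

module Direct = Make (Id)

let direct : Direct.s scheduler =
  { bind = (fun x f -> f (Direct.prj x))
  ; return = (fun x -> Direct.inj x) }

(* Library code: no functor, only a scheduler value and the syscall it
   needs, both passed as ordinary arguments. *)
let add_twice (type s) (sched : s scheduler) (read : unit -> (int, s) io)
    : (int, s) io =
  sched.bind (read ()) (fun a ->
      sched.bind (read ()) (fun b -> sched.return (a + b)))

let () =
  let counter = ref 0 in
  let read () =
    incr counter;
    Direct.inj !counter
  in
  Printf.printf "%d\n" (Direct.prj (add_twice direct read))
```

The `'s` parameter keeps actions from different schedulers from being mixed by accident, since `Direct.s` is a distinct type from the tag of any other instance.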

Another option is to plug a given implementation under an interface at link time. This way, a library can use a concrete module and we can specialize it with a specific implementation (unix, lwt or async), but we have never tried this pattern.

From what I know, the definition of Future given by dune, for example, is good but, again, it is not a universal solution when other patterns may fit your case better (depending on what you want to provide). MirageOS has an interesting story about all of this because we tried to see which patterns scale to big projects.

My opinion is that none of these patterns can be used everywhere for our purposes. At the very least we should document them, but a magic solution like a module type FUTURE should not be the only solution.


Before shipping a standard library for I/O + monads, maybe a really comprehensive library of examples for the ones that exist (Lwt? others?) could be useful? It would need to be crowd-sourced, because it is too much work otherwise.

I don’t use monads myself, but questions come up pretty regularly on this board, that, if they were asked about normal threads+I/O, would be considered really elementary. They’re not elementary at all with monads, so it’s understandable people ask them, but really there should be a large library of explanations/howtos, I think.

Thanks everyone for the thoughts. Here is my thinking. If the only abstraction provided is:

type +'a t

val bind : 'a t -> ('a -> 'b t) -> 'b t
val return : 'a -> 'a t

Well then, that’s a monad. It’s not specifically about I/O. Now, we could argue that the OCaml Stdlib should ship signatures for monad and a few other sensible categorical abstractions. But let’s assume that it won’t. In that case we can ask: what other operations should this IO type specify to make it more generally useful? To be quite honest, I don’t know what exact operations other than the above would be a useful minimal set. I’ll investigate and post more later.

TL;DR at a minimum, you need select, all the file and socket operations, and a timer queue.
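As a very rough illustration of what such a minimal set might look like as a signature, here is a sketch. All names (`IO`, `fd`, `first`, the `Copy` functor, the in-memory `Mem` instance) are assumptions invented for the example, and real file and socket operations would need many more functions than this.

```ocaml
module type IO = sig
  type +'a t

  val return : 'a -> 'a t
  val bind : 'a t -> ('a -> 'b t) -> 'b t

  (* File/socket operations on an abstract descriptor. *)
  type fd

  val read : fd -> bytes -> int -> int -> int t (* 0 at end of input *)
  val write : fd -> string -> int -> int -> int t

  (* Timer queue. *)
  val sleep : float -> unit t

  (* Select-like choice: the first action to complete wins. *)
  val first : 'a t -> 'a t -> 'a t
end

(* Library code written only against the signature. *)
module Copy (Io : IO) = struct
  let ( >>= ) = Io.bind

  (* Copies [src] to [dst]; for brevity it assumes [write] is never short. *)
  let copy src dst =
    let buf = Bytes.create 4 in
    let rec loop total =
      Io.read src buf 0 (Bytes.length buf) >>= fun n ->
      if n = 0 then Io.return total
      else (
        Io.write dst (Bytes.sub_string buf 0 n) 0 n >>= fun _written ->
        loop (total + n))
    in
    loop 0
end

(* An in-memory, blocking instance, only to show the signature can be
   implemented; descriptors are buffers and the clock is virtual. *)
module Mem = struct
  type 'a t = 'a

  let return x = x
  let bind x f = f x

  type fd = { data : Buffer.t; mutable pos : int }

  let of_string s =
    let data = Buffer.create 16 in
    Buffer.add_string data s;
    { data; pos = 0 }

  let contents fd = Buffer.contents fd.data

  let read fd buf off len =
    let n = min len (Buffer.length fd.data - fd.pos) in
    Buffer.blit fd.data fd.pos buf off n;
    fd.pos <- fd.pos + n;
    n

  let write fd s off len =
    Buffer.add_substring fd.data s off len;
    len

  let clock = ref 0.0
  let sleep d = clock := !clock +. d

  (* Everything completes immediately here, so the first one always wins. *)
  let first a _ = a
end

module Mem_copy = Copy (Mem)
```

For example, `Mem_copy.copy (Mem.of_string "hello world") (Mem.of_string "")` copies the eleven bytes in chunks of four and returns the total count.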

Back before pthreads, I used a number of different “threads-like” subsystems of varying completeness. They were “green threads” (not callback-based monad-like things), but even so, each of them had (a) a well-developed set of I/O and timer operations. Some had further support for (b) mutex/condvar stuff (again, built on top of the green thread abstraction).

I’d suggest that at a minimum, you’d want to provide (a) above, sufficiently well-developed that it would be a straightforward[1] matter to write nontrivial, I/O-intensive and complex programs.

[1] I’m aware that programming with monads is nontrivial. But it’s straightforward in the sense that you have to keep a clear mind and apply the rules with mechanical precision: do that, and it just works.

P.S. For those who wish to know more about what I mean by “minimum”, the specific thread package I looked at in 1994 (and ported underneath Python, though I have this vague memory of porting it under caml-light … just a vague memory) was Robbert van Renesse’s “rvr_threads”. I also worked with GAtech’s pthreads and CMU’s cthreads, but rvr’s little thread package was really small and simple.

P.P.S. Again, I’m not saying you want to provide “green threads” in OCaml. Far from it. What I’m saying is that the work to provide a decent set of I/O, timer, etc. operations with a green threads package is similar to what you have to do for a monad. So looking at what they provided in the former can tell you what’s needed in the latter.