[ANN] iostream 0.1

Hello folks,

I’ve just released iostream 0.1. iostream is a library that provides composable abstractions for input/output streams of bytes. Here, composability means that users can create their own streams. This makes an Iostream.Out.t more powerful than the standard out_channel: it is an abstraction that might compress data before writing to a Buffer.t, write to a socket, write nowhere, or send the bytes to several other outputs.
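
To make the composability idea concrete, here is a minimal sketch in plain OCaml. The record type `out` and the helpers `of_buffer`, `null`, `tee`, and `write_string` are hypothetical names chosen for illustration; they are not iostream’s actual definitions, for which the online docs remain the reference.

```ocaml
(* Hypothetical output-stream type for illustration; not iostream's API. *)
type out = {
  output : bytes -> int -> int -> unit;  (* write [len] bytes of [buf] from [off] *)
  close : unit -> unit;
}

(* An output that appends everything to a Buffer.t. *)
let of_buffer (b : Buffer.t) : out =
  { output = (fun buf off len -> Buffer.add_subbytes b buf off len);
    close = (fun () -> ()) }

(* An output that writes nowhere. *)
let null : out = { output = (fun _ _ _ -> ()); close = (fun () -> ()) }

(* An output that forwards every write to all outputs in [outs]. *)
let tee (outs : out list) : out =
  { output = (fun buf off len -> List.iter (fun o -> o.output buf off len) outs);
    close = (fun () -> List.iter (fun o -> o.close ()) outs) }

(* Convenience helper built only on top of [output]. *)
let write_string (o : out) (s : string) : unit =
  o.output (Bytes.of_string s) 0 (String.length s)

(* Usage: one logical stream feeding two buffers and a null sink. *)
let b1 = Buffer.create 16
let b2 = Buffer.create 16

let () =
  let o = tee [ of_buffer b1; of_buffer b2; null ] in
  write_string o "hello";
  o.close ()
```

Code that only knows about `out` does not need to change when the destination does, which is the point of treating the stream as an abstraction rather than a concrete channel.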

There already exist some similar abstractions in the ecosystem, such as Batteries’ BatIO, an object-based version of this in ocamlnet, and probably uncountably many other projects. I have ancestor versions of this in many of my own projects. This is my way of dealing with my failure to implement RFC19 in OCaml itself (a worthy read about tradeoffs and use cases, for anyone interested). In other languages you can find Rust’s Read, BufRead, and Write traits, and Go’s Reader and Writer interfaces.

A design note: iostream separates Iostream.In.t from Iostream.In_buf.t. Iostream.In.t is basically like a Unix FD or Rust’s Read: it gives you read: bytes -> int -> int -> int. Iostream.In_buf.t is the equivalent of Rust’s BufRead: it has its own buffer and gives you access to it. Unlike in_channel’s magic methods such as input_line, it lets you inspect the buffer to look ahead and consume exactly the amount of input you need, with no leftovers.
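
A rough sketch of that split, under the assumption of hand-rolled types: `input`, `buffered`, `fill_buf`, `consume`, and `read_line` below are hypothetical names (loosely modelled on Rust’s BufRead), not the functions iostream exposes.

```ocaml
(* Hypothetical types for illustration; not iostream's definitions. *)

(* Unbuffered input: [read buf off len] returns the number of bytes read,
   or 0 at end of input. *)
type input = { read : bytes -> int -> int -> int }

(* Buffered input: owns a buffer and exposes the unconsumed slice. *)
type buffered = {
  src : input;
  buf : bytes;
  mutable off : int;  (* start of unconsumed data in [buf] *)
  mutable len : int;  (* number of unconsumed bytes *)
}

let of_string (s : string) : input =
  let pos = ref 0 in
  { read = (fun buf off len ->
      let n = min len (String.length s - !pos) in
      Bytes.blit_string s !pos buf off n;
      pos := !pos + n;
      n) }

let buffered ?(size = 4096) src =
  { src; buf = Bytes.create size; off = 0; len = 0 }

(* Refill only when the buffer is empty, then return the unconsumed slice. *)
let fill_buf (b : buffered) : bytes * int * int =
  if b.len = 0 then begin
    b.off <- 0;
    b.len <- b.src.read b.buf 0 (Bytes.length b.buf)
  end;
  (b.buf, b.off, b.len)

(* Mark the first [n] unconsumed bytes as used. *)
let consume (b : buffered) (n : int) : unit =
  assert (n <= b.len);
  b.off <- b.off + n;
  b.len <- b.len - n

(* Read one line by looking ahead in the buffer and consuming exactly up to
   and including the newline; anything after it stays available. *)
let read_line (b : buffered) : string option =
  let acc = Buffer.create 32 in
  let rec loop () =
    let buf, off, len = fill_buf b in
    if len = 0 then
      (if Buffer.length acc = 0 then None else Some (Buffer.contents acc))
    else
      match Bytes.index_from_opt buf off '\n' with
      | Some i when i < off + len ->
        Buffer.add_subbytes acc buf off (i - off);
        consume b (i - off + 1);
        Some (Buffer.contents acc)
      | _ ->
        Buffer.add_subbytes acc buf off len;
        consume b len;
        loop ()
  in
  loop ()

(* Usage with a deliberately tiny buffer to force several refills. *)
let ic = buffered ~size:4 (of_string "ab\ncdefg\nh")
let l1 = read_line ic
let l2 = read_line ic
let l3 = read_line ic
let l4 = read_line ic
```

Because `read_line` is written against `fill_buf` and `consume` only, other parsers could share the same buffered stream without losing bytes to a hidden internal buffer.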

The library is under the MIT license. The online docs are here.

Looks like an interesting library to me. I have a few comments or questions:

  1. Is there going to be support for bigarrays or do you not support them on purpose?

  2. The documentation states that close “must be idempotent” for output streams and is “idempotent” for input streams. Could you clarify the distinction here? Also, did you consider just making sure that your modules never call these functions more than once? Seems like that would be more helpful to users of the library.

  3. Did you consider splitting the parts that depend on Unix into a sub library?

These are good questions, thanks. It’s not easy to write interfaces with such a general purpose scope :sweat_smile:

  1. I didn’t see a good way to support both bytes and bigarrays in the same interface; asking implementors to have both read and read_bigstring would be annoying, for example. However, a good chunk of the ecosystem does rely on bigarrays, so it is an important topic. I can imagine two solutions off the top of my head:

    • have read_bigstring, write_bigstring with default implementations just going through an intermediate layer of bytes, and the possibility for the implementor of the stream to write a custom version
    • parametrize the stream types with the underlying type, i.e. have bytes Iostream.In.t as well as bigstring Iostream.In.t. But here the difficulty is that all the convenience combinators in the library become impossible to write, or are specialized just for (say) the bytes version.
  2. close should be idempotent for both, i.e. closing twice shouldn’t fail. The reason is that it’s just too hard to keep track of whether you closed already, especially if you mix explicit closing (closing a connection) with resource handlers such as with_in : … or Fun.protect that will ensure proper disposal of resources.

  3. How do you provide an interface with an as_fd : unit -> Unix.file_descr option in a sub-library? It has to be part of the core interface, or not be there at all. This part comes from the initial goal, in RFC 19, to replace standard channels (which provide things like seek and pos), but I agree it’s annoying to depend on Unix.
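
To illustrate point 2, here is a small sketch of why idempotent close composes well with resource handlers. The `out` record, `make`, and `with_out` are hypothetical names used only for this example (not iostream’s API); `Fun.protect` is the standard library function.

```ocaml
(* Hypothetical resource type for illustration; not iostream's definition. *)
type out = { write : string -> unit; close : unit -> unit }

(* Wrap a raw close function so that second and later calls are no-ops. *)
let make ~write ~(close : unit -> unit) : out =
  let closed = ref false in
  { write = (fun s -> if !closed then invalid_arg "write: closed" else write s);
    close = (fun () -> if not !closed then (closed := true; close ())) }

(* Resource handler: closes [o] whether [f] returns or raises. *)
let with_out (o : out) (f : out -> 'a) : 'a =
  Fun.protect ~finally:o.close (fun () -> f o)

let close_count = ref 0   (* counts calls to the underlying close *)
let log = Buffer.create 16

let () =
  let o =
    make ~write:(Buffer.add_string log) ~close:(fun () -> incr close_count)
  in
  with_out o (fun o ->
    o.write "bye";
    o.close ())  (* explicit close; with_out closes again on exit, harmlessly *)
```

Without the `closed` flag, the caller would have to remember whether the explicit close already ran before the handler’s `finally` fires.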

Thanks for making this a lightweight, standalone library! The lack of agreed Go-like Reader/Writer interfaces like these is one of OCaml’s major pain points to me; they make it so easy to pipe, tee, (de)compress, reencode, colorise or otherwise process any flow of bytes between any two components.

Nice having an as_fd function from the start: Have you considered whether as_fd should mean “direct wrapper of a Unix FD” or just a weaker guarantee of being backed by one (or a finer grained distinction)? It can be hard to compose writers that use terminal capabilities (isatty and friends) when only one of them can wrap the FD directly.

A possibly contrived example: I once patched a Go library for updating terminal output (like progress bars) to enable updates on FDs passed through an optional interface (basically as_fd) rather than simply check for direct wrapping. This allowed me to compose it with a VT color sequence emulator for Windows (which didn’t support VT sequences yet), and that in turn allowed me to use common functions for colored output cross-platform. However, this would have broken if the TUI library had tried to bypass the given writer, assuming that it owned the FD directly.

Also nice to see In_buf with direct read access to the inner buffer.

I’ll provide some opinions about this as well:

However a good chunk of the ecosystem does rely on bigarrays so it is an important topic.

I think OCaml desperately needs pinning, like the CLR has, so that we can fix the address of string/bytes for at least the duration of a C call, among other uses. This would reduce the use cases for “bigstrings”.

How do you provide an interface with a as_fd : unit -> Unix.file_descr option in a sub-library?

Ideally you wouldn’t, instead Sys or some other stdlib module would provide a datatype representing any “OS handle” (file descriptors, Windows’ HANDLE, whatever) and Unix.file_descr would be either an alias or created from it. Reminds me of this PR.

Thanks for making this a lightweight, standalone library! The lack of agreed Go-like Reader/Writer interfaces like these is one of OCaml’s major pain points to me

Definitely agree, it’s hard to go back when you’ve used this kind of
thing (for me it’s the Rust ones, but same deal).

Nice having an as_fd function from the start: Have you considered whether as_fd should mean “direct wrapper of a Unix FD” or just a weaker guarantee of being backed by one?

It’s a bit dealer’s choice. Combinators will not carry the FD over by
default (because the FD is also used for things like seeking which are
very much not preserved by the combinators), but you could write your
own combinators that do it.

For me personally it’s really only if the Out.t or In.t is directly backed by
the FD though.
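
A sketch of what that looks like in practice, with heavy assumptions: the `fd` type below is a stand-in for Unix.file_descr so the example needs no unix library, and `out`, `of_fd`, and `uppercase` are hypothetical names, not iostream’s combinators.

```ocaml
(* [fd] stands in for Unix.file_descr; all names here are hypothetical. *)
type fd = int

type out = {
  write : string -> unit;
  as_fd : unit -> fd option;
}

(* An output directly backed by a descriptor reports it. *)
let of_fd ~(write : string -> unit) (fd : fd) : out =
  { write; as_fd = (fun () -> Some fd) }

(* A combinator that transforms the data does not carry the FD over:
   writing to the FD directly would bypass the transformation. *)
let uppercase (o : out) : out =
  { write = (fun s -> o.write (String.uppercase_ascii s));
    as_fd = (fun () -> None) }

(* Usage: the buffer plays the role of the device behind descriptor 1. *)
let sink = Buffer.create 16
let raw = of_fd ~write:(Buffer.add_string sink) 1
let up = uppercase raw
let () = up.write "hi"
```

A custom combinator that only observes the bytes could choose to forward `o.as_fd` instead, which is the “write your own combinators” option mentioned above.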
