I’m still not quite sure what you mean about multiple uses of run. Running two event loops in a single domain will never work (Lwt will raise an exception if you try, and Eio probably should too).
To run a new event loop in a new domain, you can spawn one using env. That ensures that the new loop will be of the same type, and so the resources from one will work in the other. e.g.
let () =
  Eio_main.run @@ fun env ->
  Eio.Domain_manager.run env#domain_mgr (fun () ->
      Eio.Flow.copy_string "Hello, from a new domain!\n" env#stdout
    )
We need to check a bit to make sure everything is thread-safe. I’m a bit nervous about sharing FDs across domains because if one domain closes an FD while another is using it, there is a possible race where the second domain might access an unrelated FD that just got opened with the same number. However, given that the stdlib already has a much worse version of this problem even within a single domain, we can probably live with it for now!
[ Example of implementing directory sandboxing using nested effect handlers ]
Interesting. I suspect this won’t quite work if you try to pass a directory from one fibre to another, though, since only the original fibre will have the appropriate handler in its stack.
That’s not to say that the capability approach is a bad design though. On the contrary, I think that it might combine very nicely with some of the work we’ve been doing at Jane Street on local allocations, allowing us to have safe effect handlers without/before having a full blown effect system in the language. I’m very interested in exploring this direction.
I remember hearing about some local allocations stuff in the Signals and Threads podcast. That would be very useful! We have several APIs where we pass a slice of a buffer to a callback, and ask users not to continue using it after the callback returns. Would be great to ensure that statically.
It still requires users to understand class syntax to read the API…
odoc hides the class definition by default, which is useful here. Possibly it needs a comment saying “You don’t need to read this unless you’re implementing the API yourself” or something?
e.g. in odoc you see this:
class virtual source : object ... end
val read : #source -> Cstruct.t -> int
So the only things you really need to know are: class foo = ... can be treated as type foo, and #source can be treated as just source.
However, we do make use of row-polymorphism. For example, in addition to source and sink, we have two_way, which includes both APIs. You have to know that the # allows you to call read on a socket, even though it looks at first like a separate type.
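To make the role of the # concrete, here is a minimal sketch using only the standard library. The class definitions and the loopback helper are simplified, hypothetical stand-ins invented for illustration; they are not Eio's real definitions. It shows that a function taking #source also accepts a two_way value:

```ocaml
(* Hypothetical, simplified stand-ins for the flow classes (not Eio's real API). *)
class virtual source = object
  method virtual read : bytes -> int
end

class virtual sink = object
  method virtual write : string -> unit
end

(* A two-way flow offers both sets of methods. *)
class virtual two_way = object
  inherit source
  inherit sink
end

(* [#source] means "any object with at least the methods of [source]". *)
let read (src : #source) buf = src#read buf
let write (dst : #sink) data = dst#write data

(* A toy in-memory "socket": reading returns whatever was written earlier. *)
let loopback () : two_way =
  let pending = Buffer.create 16 in
  object
    method read buf =
      let len = min (Bytes.length buf) (Buffer.length pending) in
      Buffer.blit pending 0 buf 0 len;
      (* Keep the unread remainder for the next call. *)
      let rest = Buffer.sub pending len (Buffer.length pending - len) in
      Buffer.clear pending;
      Buffer.add_string pending rest;
      len
    method write data = Buffer.add_string pending data
  end

let () =
  let sock = loopback () in
  write sock "hello";
  let buf = Bytes.create 16 in
  (* [sock] has type [two_way], yet [read] accepts it because of the [#]. *)
  let n = read sock buf in
  print_endline (Bytes.sub_string buf 0 n)
```

If read were declared as taking plain source instead of #source, the call on sock would need an explicit coercion such as (sock :> source).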
I would note though that Python has objects, methods and classes and is regularly recommended as a first language for children. I’m not saying this is a good thing (objects shouldn’t be first choice if something else will do), but it’s not an advanced topic.
and to decode the more difficult type errors from object types.
Using functions to access objects seems to avoid that problem. Here’s an example, where we try to write to stdin instead of stdout:
let () =
  Eio_main.run @@ fun env ->
  let dst = Eio.Stdenv.stdin env in
  Eio.Flow.copy_string "Hello!\n" dst
                                  ^^^
Error: This expression has type Eio.Flow.source
       but an expression was expected of type #Eio.Flow.sink
       The first object type has no method copy
Taking a method call on every operation also seems like an unnecessary cost.
I did some benchmarking, comparing various schemes, in the talex5/flow-tests repository on GitHub. In particular, this compared Conduit 3 (using first-class modules and GADTs) with objects. The conclusion was that for accessing OS resources the speed of a method call hardly matters. In fact, Conduit 3 was slower, but for other reasons, I think. Here’s a simple benchmark with my own GADT version:
module Object = struct
  class type source =
    object
      method read : bytes -> int -> int -> unit
      method close : unit
    end

  let of_channel ch =
    object (_ : source)
      method read buf off len = really_input ch buf off len
      method close = close_in ch
    end

  let read (source : #source) = source#read
end

module Gadt = struct
  module type SOURCE = sig
    type t
    val read : t -> bytes -> int -> int -> unit
    val close : t -> unit
  end

  type source = Source : (module SOURCE with type t = 'a) * 'a -> source

  module Channel_source = struct
    type t = in_channel
    let read = really_input
    let close = close_in
  end

  let of_channel ch =
    Source ((module Channel_source), ch)

  let read (Source ((module Source), source)) buf off len =
    Source.read source buf off len
end

let time_object ch =
  let source = Object.of_channel ch in
  let buf = Bytes.create 4096 in
  let t0 = Unix.gettimeofday () in
  for _i = 1 to 1_000_000 do
    Object.read source buf 0 4096
  done;
  let t1 = Unix.gettimeofday () in
  Printf.printf "Time with object: %.3f\n" (t1 -. t0)

let time_gadt ch =
  let source = Gadt.of_channel ch in
  let buf = Bytes.create 4096 in
  let t0 = Unix.gettimeofday () in
  for _i = 1 to 1_000_000 do
    Gadt.read source buf 0 4096
  done;
  let t1 = Unix.gettimeofday () in
  Printf.printf "Time with module: %.3f\n" (t1 -. t0)

let () =
  let zero = open_in "/dev/zero" in
  time_gadt zero;
  time_object zero

$ dune exec -- ./test.exe
Time with module: 0.243
Time with object: 0.243
This is for reading from /dev/zero, which is very fast for the kernel. If you’re reading from a file or socket then obviously the kernel will be doing more work and the speed benefit (if any) will be smaller.
However, we also want to use sub-types, which is much easier with objects. As well as source, sink and two_way, some flows can be closed while others can’t. File sources should also allow pread, and sockets should allow send_msg. When using io_uring, you should be able to get the file descriptor, etc.
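As a rough illustration of why sub-typing is convenient here, the following sketch layers a closable capability and a pread method on top of a basic source. All names in it (closable, file_source, file_of_string, the file_offset label) are hypothetical and chosen for the example; they are not Eio's actual API, and the "file" is just a string in memory:

```ocaml
(* Hypothetical class types for illustration only (not Eio's real API). *)
class type source = object
  method read : bytes -> int
end

class type closable = object
  method close : unit
end

(* A file source can do everything a source can, can be closed,
   and additionally supports positional reads. *)
class type file_source = object
  inherit source
  inherit closable
  method pread : file_offset:int -> bytes -> int
end

(* Each function asks only for the capability it needs. *)
let read (s : #source) buf = s#read buf
let close (c : #closable) = c#close
let pread (f : #file_source) ~file_offset buf = f#pread ~file_offset buf

(* A toy "file" backed by a string, standing in for a real OS resource. *)
let file_of_string data : file_source =
  let pos = ref 0 in
  let closed = ref false in
  object (self)
    method pread ~file_offset buf =
      if !closed then invalid_arg "file is closed";
      let available = String.length data - file_offset in
      let len = max 0 (min (Bytes.length buf) available) in
      if len > 0 then Bytes.blit_string data file_offset buf 0 len;
      len
    method read buf =
      (* Sequential read: a positional read at the current offset. *)
      let len = self#pread ~file_offset:(!pos) buf in
      pos := !pos + len;
      len
    method close = closed := true
  end

let () =
  let f = file_of_string "hello world" in
  let buf = Bytes.create 5 in
  let n = read f buf in
  Printf.printf "read: %s\n" (Bytes.sub_string buf 0 n);
  let n = pread f ~file_offset:6 buf in
  Printf.printf "pread: %s\n" (Bytes.sub_string buf 0 n);
  close f
```

In this sketch, passing an object that only implements read to close or pread is rejected at compile time, while a file_source can be handed to any function that expects a #source without writing a conversion.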