Curious behaviour with domains

With Domainslib, I try:

(* T is assumed to be an alias for Domainslib.Task *)
module T = Domainslib.Task

let main () = 
  let pool = T.setup_pool ~num_domains:5 () in
  let t = Array.init 20 (fun i -> 
      T.async pool (fun _ ->
        print_string @@ Printf.sprintf "START %d\n" i;
        Thread.delay 2.0;
        print_string @@ Printf.sprintf "END %d\n" i)
  ) in
  Array.iter (T.await pool) t

The result is:

START 2
START 0
START 3
START 4
START 1
Fatal error: exception Stdlib.Effect.Unhandled(Domainslib__Task.Wait(_, _))

I have a similar result with the Thread.delay 2.0 replaced by:

ignore @@ Sys.command "sleep 2";

With ignore @@ read_line () it is similar too.

However, if I replace this delay by the sleepf proposed by domain-local-timeout:

let sleepf seconds =
  let t = Domain_local_await.prepare_for_await () in
  let cancel = Domain_local_timeout.set_timeoutf seconds t.release in
  try t.await ()
  with exn ->
    cancel ();
    raise exn

I have:

START 4
START 3
START 0
START 2
START 5
START 6
START 10
START 11
START 8
START 7
START 14
START 15
START 16
START 17
START 18
START 19
START 9
START 1
START 13
START 12
Fatal error: exception Stdlib.Effect.Unhandled(Domainslib__Task.Wait(_, _))

I guess domain-local-timeout works better with Domainslib and lets new threads enter the pool while others are sleeping. But it seems that T.await pool is impatient and can't cope with a pool in which no thread is actually working.

I don’t see the use of

val run : pool -> 'a task -> 'a

in your code. Its documentation states: "This function should be used at the top level to enclose the calls to other functions that may await on promises. This includes await, parallel_for and its variants. Otherwise, those functions will raise Unhandled exception."
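To make that concrete, here is a minimal sketch of the original program with the awaiting code wrapped in T.run. The helper name run_tasks, the shorter delay, and the explicit T.teardown_pool are my own additions for illustration, and it assumes the domainslib and threads.posix libraries are linked.

```ocaml
(* Requires the domainslib and threads.posix libraries. *)
module T = Domainslib.Task

(* run_tasks is a hypothetical helper: it spawns n sleeping tasks and
   collects their results in task order. *)
let run_tasks pool n =
  (* T.run installs the effect handler that T.await depends on. *)
  T.run pool (fun () ->
    let promises =
      Array.init n (fun i ->
        T.async pool (fun () ->
          Printf.printf "START %d\n%!" i;
          Thread.delay 0.1;
          Printf.printf "END %d\n%!" i;
          i))
    in
    Array.map (T.await pool) promises)

let () =
  let pool = T.setup_pool ~num_domains:5 () in
  let results = run_tasks pool 20 in
  T.teardown_pool pool;
  Printf.printf "%d tasks finished\n" (Array.length results)
```

Everything that calls T.await sits inside the function passed to T.run, so the Wait effect always has a handler.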

OK, thanks. That works much better.

I note that Thread.delay 2.0 and ignore @@ Sys.command "sleep 2" both prevent a new thread from entering the pool. A read_line () behaves the same way.

I guess that using domainslib instead of Lwt would not permit a high load, since any blocking I/O would leave one of the pool's threads idle.

A sleepf 2.0 as described above lets new threads enter the pool, but fails with the error:

Fatal error: exception Failure("Domain_local_timeout.set_timeoutf not implemented")
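That failure looks like what domain-local-timeout reports when no timeout implementation has been installed. As far as I understand its README, a default implementation based on threads and Unix can be enabled with Domain_local_timeout.set_system before any timeout is requested. The sketch below assumes that call and the domain-local-timeout, domain-local-await, threads.posix and unix libraries; I have not checked how it interacts with a Domainslib pool.

```ocaml
(* Assumption: set_system installs the default, thread-based timeout
   implementation, as described in the domain-local-timeout README. *)
let () = Domain_local_timeout.set_system (module Thread) (module Unix)

(* Same sleepf as above: suspend until the timeout releases the trigger. *)
let sleepf seconds =
  let t = Domain_local_await.prepare_for_await () in
  let cancel = Domain_local_timeout.set_timeoutf seconds t.release in
  try t.await ()
  with exn ->
    cancel ();
    raise exn

let () =
  sleepf 0.1;
  print_endline "woke up"
```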

My understanding is that the domainslib API is intended for parallelism (running CPU-bound tasks in parallel) rather than concurrency (coordinating computation and synchronization between an unbounded number of tasks, including I/O). For concurrency you should rather look at eio, or moonpool, Affect, Miou, etc.

In other words: if you want to replace Lwt, don’t use Domainslib, which is a lower-level building block.
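For contrast, this is the kind of workload I would expect Domainslib to suit: purely CPU-bound work split across domains. It is only a sketch; fib and sum_fibs are names made up for the example, and the pool size is arbitrary.

```ocaml
(* Requires the domainslib library. *)
module T = Domainslib.Task

(* Naive Fibonacci: deliberately CPU-bound, with no I/O or sleeping. *)
let rec fib n = if n < 2 then n else fib (n - 1) + fib (n - 2)

(* Sum fib 0 .. fib (n - 1), computing the terms in parallel.
   parallel_for_reduce takes an inclusive ~finish bound. *)
let sum_fibs pool n =
  T.run pool (fun () ->
    T.parallel_for_reduce ~start:0 ~finish:(n - 1)
      ~body:(fun i -> fib i) pool ( + ) 0)

let () =
  let pool = T.setup_pool ~num_domains:3 () in
  let total = sum_fibs pool 25 in
  T.teardown_pool pool;
  Printf.printf "sum of fib 0..24 = %d\n" total
```

No task ever blocks here, so every domain in the pool stays busy until the work runs out.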