I have an app design question about using multiple domains in, say, Eio or some other multicore library. We can do e.g.
let num_domains = Domain.recommended_domain_count () - 1
let additional_domains = (Eio.Stdenv.domain_mgr env, num_domains)
...
Cohttp_eio.Server.run ~additional_domains ...
…to make the server run on multiple domains (I assume that’s how it works).
But suppose another part of my app uses Eio.Executor_pool to run background jobs on multiple cores, and I also need to pass it similar arguments, roughly like this:
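(A minimal sketch of what I mean, assuming the Eio 1.x Executor_pool API; num_domains is the same count computed above, and the job and its weight are just placeholders:)

let () = Eio_main.run @@ fun env ->
  Eio.Switch.run @@ fun sw ->
  (* Worker domains for background jobs, drawn from the same count as above *)
  let pool =
    Eio.Executor_pool.create ~sw ~domain_count:num_domains
      (Eio.Stdenv.domain_mgr env)
  in
  (* Submit a CPU-bound job; [weight] estimates how much of one domain it uses *)
  let answer = Eio.Executor_pool.submit_exn pool ~weight:1.0 (fun () -> 40 + 2) in
  Eio.traceln "background job result: %d" answer
  (* ... the rest of the app (e.g. the HTTP server) would run here ... *)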
So do these different functions coordinate to ensure that they don't create num_domains + num_domains domains? I.e., do they check the existing number of domains before creating new ones? Or is that check guaranteed by Eio at a lower level?
Unfortunately, the answer is no. The implementation of Cohttp_eio.Server predates Eio.Executor_pool, which is why it uses the raw domain-spawning API rather than a pool. In fact, it is Eio.Net.run_server that uses the Domain_manager.run API; I think there's a good argument for changing this to use a pool (perhaps an issue for this would be good).
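To make the non-coordination concrete, this is roughly the pattern run_server-style code follows for its extra domains (only a sketch, not the actual implementation; spawn_extra_domains and handle_connections are made-up names). Each call to Eio.Domain_manager.run spawns a fresh domain with no global bookkeeping, so two subsystems doing this independently can exceed the recommended count:

let spawn_extra_domains ~sw dm n handle_connections =
  for _ = 1 to n do
    (* Each iteration forks a fiber that runs [handle_connections] in its own
       freshly spawned domain; nothing checks how many domains already exist. *)
    Eio.Fiber.fork ~sw (fun () ->
      Eio.Domain_manager.run dm handle_connections)
  done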
As a workaround, you can wrap the existing domain manager implementation in something cleverer (e.g. one backed by a shared pool).
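Since wrapping means implementing Eio's domain-manager provider interface, an even simpler stop-gap (not the wrapping approach itself, just splitting the domain budget manually between the two consumers) could look something like this; the 70/30 split and the num_* names are only for illustration:

let () = Eio_main.run @@ fun env ->
  Eio.Switch.run @@ fun sw ->
  let dm = Eio.Stdenv.domain_mgr env in
  (* One shared budget of extra domains for the whole app *)
  let budget = max 2 (Domain.recommended_domain_count () - 1) in
  let num_server_domains = budget * 7 / 10 in
  let num_pool_domains = budget - num_server_domains in
  (* The background-job pool gets its share of the budget... *)
  let _pool = Eio.Executor_pool.create ~sw ~domain_count:num_pool_domains dm in
  (* ...and the HTTP server gets the rest, e.g.:
     Cohttp_eio.Server.run ~additional_domains:(dm, num_server_domains) ... *)
  ()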
Got it. I think two tracking issues will be needed, since two APIs will change:
EDIT: I also want to say that this analysis was fairly trivial thanks to Eio’s design exposing functions which take the domain manager. I think this is a win for the capabilities-based design.
That said, Eio.Executor_pool.submit seems designed for a single subsystem that owns the pool and submits coarse-grained jobs to it, not for multiple decoupled subsystems. I can't see a way for both an HTTP server and, say, an async message-queue subsystem to share the same pool.
OK, I’ve suggested a multi-core runtime strategy, basically:
let () = Par.run @@ fun env ->
  (* ... run the app on multiple cores ... *)
  (* Distribute slices of the array across all worker domains *)
  let result = Par.sum env large_float_array in
  ...
So this takes care of starting up the recommended number of domains and running the app across all of them, while also providing a way to submit parallelized (i.e. CPU-intensive) tasks and get a promise of the result.
This is a POC right now (linked above), but I believe it's a good direction: users don't need to worry about setting up domains, they don't need to hand all of the domains over to a specific subsystem like the HTTP server, and they don't need to figure out how many domains to allocate to what.
Of course, this is not thoroughly tested or benchmarked right now; more to come. But happy to discuss more.
Sorry to barge in on this question, but I'd just like to mention that this is exactly what Miou offers with Miou.parallel:
let () = Miou.run @@ fun () ->
  let result = Miou.parallel sum large_float_array in
  ...
You can see the documentation here (with a little example): Miou.parallel. As for an HTTP server, httpcats has been released, and it actually follows the pattern you suggest.
There’s an interesting discussion going on here regarding domains vs threads as building blocks and the merits of user-defined concurrency. Admins, can we split this discussion thread up?