I ran the experiment again, this time with `httpcats` and Miou. From what I can see, `lwt` still achieves the highest number of requests per second. I think this is mainly because, even though Miou offers a domain pool, the domains still need synchronization mechanisms that `lwt` does not have to implement. Using `Lwt_unix.fork` (rather than `Stdlib.Domain.spawn`) also avoids synchronizing the OCaml major heap between domains. In other words, the `httpaf+lwt` case is closer to 32 executables (one per core) each acting as a web server than to a single executable running OCaml tasks in parallel.
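To make the difference concrete, here is a minimal sketch (not taken from any of the benchmarked servers) contrasting the two models. It assumes OCaml 5 and the `unix` library; `counter`, `run_in_child`, `after_fork` and `after_domain` are hypothetical names used only for illustration, and plain `Unix.fork` stands in for `Lwt_unix.fork`.

```ocaml
(* Build with the unix library, e.g.
   ocamlfind ocamlopt -package unix -linkpkg sketch.ml *)

(* A piece of global state living in the OCaml heap. *)
let counter = ref 0

(* Run [f] in a forked child process and wait for it to finish.
   The child works on its own copy of the heap. *)
let run_in_child f =
  match Unix.fork () with
  | 0 ->
      f ();
      (* Leave the child immediately, without running at_exit handlers. *)
      Unix._exit 0
  | pid -> ignore (Unix.waitpid [] pid)

(* The increment happens in the child's copy of the heap,
   so the parent still sees its own, untouched value. *)
let after_fork () =
  counter := 0;
  run_in_child (fun () -> incr counter);
  !counter

(* A domain shares the heap with its parent,
   so the increment is visible once the domain has been joined. *)
let after_domain () =
  counter := 0;
  Domain.join (Domain.spawn (fun () -> incr counter));
  !counter
```

Calling `after_fork ()` before `after_domain ()` matters here: `Unix.fork` is documented to fail once the process has become multi-domain.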
However, I noticed that `httpcats` does better than what `httpun+eio` can offer. Here is a summary. This table gives the average latency reported by `wrk`/`tfb`:
| | httpaf+lwt | httpun+eio | httpcats |
|---|---|---|---|
| 8 clients, 8 threads | 19.05us | 327.14us | 32.54us |
| 512 clients, 32 threads | 8.5ms | 1.88ms | 1.07ms |
| 16 clients, 16 threads | 29.75us | 808.30us | 39.22us |
| 32 clients, 32 threads | 39.21us | 1.23ms | 64.83us |
| 64 clients, 32 threads | 425.44us | 1.26ms | 124.32us |
| 128 clients, 32 threads | 250.84us | 1.15ms | 263.56us |
| 256 clients, 32 threads | 2.51ms | 1.25ms | 471.59us |
| 512 clients, 32 threads (warmed) | 10.83ms | 1.89ms | 0.98ms |
Note that `httpcats` copes with a large number of clients better than `httpaf+lwt` and `httpun+eio`: with 512 clients, its latency is the lowest of the three. Compared to `httpaf+lwt`, this may be because Miou asks the system for events (such as the arrival of a new connection) more often than `lwt` does. In fact, `lwt` tends to keep running OCaml tasks rather than periodically asking for new events, so it simply prioritizes handling an HTTP request over accepting a new connection.
This table gives the average number of requests per second reported by `wrk`/`tfb`:
| | httpaf+lwt | httpun+eio | httpcats |
|---|---|---|---|
| 8 clients, 8 threads | 51.26k req/s | 25.37k req/s | 33.29k req/s |
| 512 clients, 32 threads | 45.65k req/s | 14.56k req/s | 16.65k req/s |
| 16 clients, 16 threads | 35.22k req/s | 13.83k req/s | 27.49k req/s |
| 32 clients, 32 threads | 25.44k req/s | 13.37k req/s | 16.6k req/s |
| 64 clients, 32 threads | 38.12k req/s | 12.08k req/s | 17.45k req/s |
| 128 clients, 32 threads | 41.31k req/s | 13.27k req/s | 18.1k req/s |
| 256 clients, 32 threads | 43.96k req/s | 14.03k req/s | 17.96k req/s |
| 512 clients, 32 threads (warmed) | 44.78k req/s | 14.37k req/s | 16.82k req/s |
As I said, `lwt` outperforms the others, but you always have to keep in mind that its implementation consists of 32 programs (one per core, none of them sharing the same GC) handling all the requests. With `httpun+eio` or `httpcats`, there really are 32 domains sharing the same major heap, with synchronization mechanisms (mutexes and condition variables) both in the OCaml runtime and in what `eio` or `miou` provide.
Furthermore, writing an application that shares a global resource between all the HTTP request handlers might be more difficult when those handlers are spawned with `Lwt_unix.fork` than with `httpun+eio` or `httpcats`.
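As an illustration of why sharing is simpler with domains, here is a minimal sketch of a request counter shared by several workers. It assumes OCaml 5 and uses only the standard library; `count_requests` and its parameters are hypothetical names, and the loop body merely stands in for a real HTTP handler. With forked processes, the same counter would need some form of inter-process communication (a pipe, shared memory, an external store).

```ocaml
(* Spawn [domains] workers that each "handle" [per_domain] requests
   and bump a single counter shared by all of them. *)
let count_requests ~domains ~per_domain =
  (* One atomic cell in the shared heap, visible to every domain. *)
  let hits = Atomic.make 0 in
  let worker () =
    for _ = 1 to per_domain do
      (* Stand-in for a request handler touching the shared resource. *)
      Atomic.incr hits
    done
  in
  let spawned = List.init domains (fun _ -> Domain.spawn worker) in
  (* Wait for every worker before reading the final value. *)
  List.iter Domain.join spawned;
  Atomic.get hits
```

The atomic increment is exactly the kind of synchronization between domains that the forked `lwt` setup never has to pay for.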
Finally, one last note: `httpcats` uses `miou.unix`, which relies on `Unix.select`. Criticizing that choice is fairly legitimate, since `Unix.select` has quite a few limitations (in particular on the number of file descriptors it can manage), but it is also something that can easily be improved. At the very least, Miou is designed so that you can inject your own system-event logic, as is done with Solo5 for unikernels.
What I want to point out above all is that, as far as I can tell, `lwt` uses libev in your example and `eio` uses io_uring. Despite the penalty Miou pays for `Unix.select`, the performance that `httpcats` offers is still interesting.
Finally, if you would like to go further with HTTP, we are currently developing `vif`, a small web framework based on `httpcats`. EDIT: `vif` is very experimental. We are still developing it, so don't expect everything to work without a hitch!