OCaml web server run multiple processes

I’ve been running some simple benchmarks to compare web servers written in JS, F# and OCaml:
environments:
JS

I started with @shonfeder’s excellent tutorial: https://shonfeder.gitlab.io/ocaml_webapp/
Seriously, if there were more tutorials like this, the OCaml community would be several times bigger; they are so helpful when you are starting out and don’t want to solve the same problem everyone else has already solved.
I then implemented parts of it in JS and F#.

the tests:
1st just renders html, with no DB calls whatsoever
2nd:

  • queries a DB
  • adds an additional item to the result
  • orders the list
  • returns the result as JSON

I ran each test 3 times and posted the best result

This is not a scientific test in a controlled environment; it is a quick and dirty test, so take it with a grain of salt (though in many ways the results are not unexpected). The tests also do not compare the languages themselves but a combination of some of their more popular libraries, which is how you would normally use them.

EDIT: adding a link to the code: https://gitlab.com/mudrz/ocaml-web-benchmarks/-/tree/master

The tests were run with wrk -t8 -c400 -d30s (8 threads, 400 connections, 30 seconds).

  HTML endpoint, no DB access

JS - single process

  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    44.16ms    4.73ms  59.22ms   88.44%
    Req/Sec     1.10k    89.00     1.50k    82.83%
  262651 requests in 30.05s, 179.60MB read
  Socket errors: connect 0, read 264, write 0, timeout 0
Requests/sec:   8739.98
Transfer/sec:      5.98MB

JS - multiple processes

  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    43.37ms   56.91ms 316.27ms   83.49%
    Req/Sec     2.43k   459.80     3.81k    73.25%
  580742 requests in 30.05s, 397.10MB read
  Socket errors: connect 0, read 239, write 0, timeout 0
Requests/sec:  19325.95
Transfer/sec:     13.21MB

OCaml - single process

  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    15.01ms    2.28ms  39.86ms   89.00%
    Req/Sec     3.27k   240.86     4.53k    89.83%
  781313 requests in 30.01s, 489.54MB read
  Socket errors: connect 0, read 308, write 0, timeout 0
Requests/sec:  26031.65
Transfer/sec:     16.31MB

F#

  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     5.46ms  840.10us  20.34ms   83.20%
    Req/Sec     8.91k   659.89    13.27k    73.25%
  2130565 requests in 30.04s, 1.45GB read
  Socket errors: connect 0, read 272, write 0, timeout 0
Requests/sec:  70923.15
Transfer/sec:     49.31MB

Results:

  • OCaml: 1.34x more requests/s than JS (multiple processes), with 2.8x less latency
  • F#: 3.66x more requests/s than JS (multiple processes) and 2.7x more than OCaml, with 2.7x less latency than OCaml

  JSON endpoint, DB access

The JSON response for JS and OCaml was:

{"excerpts":[{"author":"kan","excerpt":"Another excerpt","source":"My source2","page":"another page"},{"author":"kan","excerpt":"My excerpt","source":"my source","page":"23"}]}

for F# it was slightly longer since option types are serialized by default with a Some/None variant (it can be changed):

{"excerpts":[{"author":"kan","excerpt":"Another excerpt","source":"My source2","page":{"case":"Some","fields":["another page"]}},{"author":"kan","excerpt":"My excerpt","source":"my source","page":{"case":"Some","fields":["23"]}},{"author":"a","excerpt":"b","source":"c","page":{"case":"Some","fields":["d"]}}]}

The DB was Postgres with 10 max connections

JS - single process

  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    57.79ms    5.68ms  88.02ms   85.24%
    Req/Sec   848.20    109.95     1.37k    72.24%
  202885 requests in 30.09s, 62.69MB read
  Socket errors: connect 0, read 237, write 0, timeout 0
Requests/sec:   6742.09
Transfer/sec:      2.08MB

JS - multiple processes

  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    57.48ms   61.84ms 774.03ms   83.36%
    Req/Sec     1.20k   260.84     2.29k    71.38%
  287101 requests in 30.04s, 88.71MB read
  Socket errors: connect 0, read 286, write 38, timeout 0
Requests/sec:   9558.04
Transfer/sec:      2.95MB

OCaml

  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     6.69ms   38.78ms   1.07s    98.17%
    Req/Sec     1.78k   842.50     3.62k    56.33%
  424454 requests in 30.02s, 100.39MB read
  Socket errors: connect 0, read 253, write 0, timeout 13
Requests/sec:  14139.42
Transfer/sec:      3.34MB

F#

  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    19.27ms    3.92ms 107.53ms   82.60%
    Req/Sec     2.54k   165.82     3.26k    79.21%
  606868 requests in 30.02s, 261.02MB read
  Socket errors: connect 0, read 259, write 0, timeout 0
Requests/sec:  20214.71
Transfer/sec:      8.69MB

Results:

  • OCaml: 1.48x more requests/s than JS (up from 1.34x before), with 8.6x less latency (before: 2.8x)
  • F#: 2.1x more requests/s than JS (down from 3.66x before) and 1.43x more than OCaml (down from 2.7x before), with 2.88x MORE latency than OCaml (before: 2.7x LESS than OCaml)
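
As a cross-check, the ratios in both Results lists can be recomputed from the Requests/sec and average Latency lines of the wrk output. The sketch below does that in OCaml; the record type and function names are made up for illustration, and the JS figures are the multiple-process runs. It prints values rounded to two decimals, so the last digit may differ slightly from the figures quoted in the lists.

```ocaml
(* Requests/sec and average latency (ms), copied from the wrk output above.
   The JS figures are the multiple-process runs. *)
type run = { name : string; rps : float; latency_ms : float }

let html_js     = { name = "JS";    rps = 19325.95; latency_ms = 43.37 }
let html_ocaml  = { name = "OCaml"; rps = 26031.65; latency_ms = 15.01 }
let html_fsharp = { name = "F#";    rps = 70923.15; latency_ms = 5.46 }
let json_js     = { name = "JS";    rps = 9558.04;  latency_ms = 57.48 }
let json_ocaml  = { name = "OCaml"; rps = 14139.42; latency_ms = 6.69 }
let json_fsharp = { name = "F#";    rps = 20214.71; latency_ms = 19.27 }

(* How many times more requests/s [a] served than [b]. *)
let throughput_ratio a b = a.rps /. b.rps

(* How many times lower the average latency of [a] is than that of [b];
   a value below 1.0 means [a] was slower. *)
let latency_ratio a b = b.latency_ms /. a.latency_ms

let report test a b =
  Printf.printf "%s: %s vs %s: %.2fx requests/s, %.2fx lower latency\n"
    test a.name b.name (throughput_ratio a b) (latency_ratio a b)

let () =
  report "HTML" html_ocaml html_js;
  report "HTML" html_fsharp html_ocaml;
  report "JSON" json_ocaml json_js;
  report "JSON" json_fsharp json_ocaml
```

Only the averages are compared here; the Max and Stdev columns, which matter for the OCaml latency spikes, are not captured by these ratios.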

Observations:

  • JS is performing unexpectedly well compared to the compiled languages
  • F# (or ASP.NET Core) is really fast out of the box, with no tweaking necessary
  • OCaml is running on a single process and had a Max request time of 1.07s and a Stdev 10x that of F#; in some tests it spiked to 2 seconds for some requests. Is this the GC? How can I troubleshoot that?

Is there a good tutorial on running OCaml with multiple processes, and more generally on common use cases for web servers?
There are countless articles for the other ecosystems, but they are hard to find for OCaml, which makes it time-consuming to figure each thing out.

This is cool! I would definitely be interested in seeing the code.

I just run multiple processes and load-balance between them with nginx (with everything running in docker).
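
For comparison, an alternative that needs no external load balancer is a pre-fork model: the parent opens one listening socket, forks a few workers, and the kernel distributes incoming connections among them. This is a different technique from the nginx setup described above, and the sketch below is only a minimal illustration using the plain Unix library (not Opium or Lwt, which the benchmarked server uses). All function names are hypothetical, error handling is omitted, and the request is read but not parsed.

```ocaml
(* Pre-fork sketch using only the Unix library (link with the `unix` package).
   The parent opens one listening socket; each forked worker accepts on it. *)

let response =
  "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\
   Content-Length: 2\r\nConnection: close\r\n\r\nok"

(* Serve connections forever: read (and ignore) the request, send a fixed reply. *)
let rec worker_loop sock =
  let client, _addr = Unix.accept sock in
  let buf = Bytes.create 4096 in
  ignore (Unix.read client buf 0 (Bytes.length buf));
  ignore (Unix.write_substring client response 0 (String.length response));
  Unix.close client;
  worker_loop sock

(* Fork [n] children that each run [f]; the parent gets the list of child pids. *)
let prefork n (f : int -> unit) =
  List.init n (fun i ->
      match Unix.fork () with
      | 0 -> f i; exit 0
      | pid -> pid)

(* Open a listening TCP socket on the loopback interface. *)
let listen_on port =
  let sock = Unix.socket Unix.PF_INET Unix.SOCK_STREAM 0 in
  Unix.setsockopt sock Unix.SO_REUSEADDR true;
  Unix.bind sock (Unix.ADDR_INET (Unix.inet_addr_loopback, port));
  Unix.listen sock 128;
  sock

(* Start [workers] processes sharing one socket, then wait for them. *)
let run ~port ~workers =
  let sock = listen_on port in
  let pids = prefork workers (fun _ -> worker_loop sock) in
  List.iter (fun pid -> ignore (Unix.waitpid [] pid)) pids
```

Calling `run ~port:8080 ~workers:4` would start four workers on port 8080 (both values are arbitrary). Whether forking is safe with an Lwt-based server depends on when the fork happens relative to Lwt's initialisation, so treat this as an illustration of the idea rather than a drop-in change.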

Yup, this is also common practice in the Node.js, Ruby, etc. communities. I have personally done it in two different projects: instant capacity boost and better node utilization.

@roddy
added the code to: https://gitlab.com/mudrz/ocaml-web-benchmarks/-/tree/master

there are 3 dirs, one for each server

running separate processes sounds good, how do you handle shared state? shared redis?
(for example user session)

Is there a sample open source project you could point me to?

@yawaramin

I tried adding re-web, but am not entirely sure how to since it seems to be using esy

ideally OCaml should just be added to the TechEmpower benchmarks, but perhaps it makes sense to first see how to optimise the code and check the multicore support

also not sure if I’m doing something wrong, but the OCaml dependencies take ages to install
Is this because they are also being compiled?

Yeah that’s a common way to do it. Also databases or cookies.

Sorry I don’t understand what you tried to do. ReWeb comes with an example server, you can run it and then wrk http://127.0.0.1:8080/hello to do a simple hello-world benchmark.

Yeah. Opam’s model is that most projects share the same opam switch, where packages are compiled once and ‘globally’ installed.

I found it generally interesting to compare OCaml and F# syntaxes and how things work, so adding this if others find it interesting:

HTML templates

OCaml (tyxml):
each element is a function that has labeled arguments for attributes and a final argument for the children

  head
    (title (txt "OCaml Webapp Tutorial"))
    [ meta ~a:[a_charset "UTF-8"] ()
    ; link ~rel:[`Stylesheet] ~href:"/static/style.css" () ]

F# (Giraffe view engine; there were others, but I haven’t tested them):
each element is a function that accepts 2 lists - the first one for attributes, the second for children

        head [] [
            title [] [ txt "Fsharp Bench" ]
            meta [ _charset "UTF-8" ]
            link [ _rel  "stylesheet"
                   _type "text/css"
                   _href "/style.css" ]
        ]

DB Query

OCaml (caqti rapper):

let get_excerpts_by_author conn =
  let open Excerpt in
  let sql = [%rapper get_many {sql|
      SELECT @string{author}, @string{excerpt}, @string{source}, @string?{page}
      FROM excerpts
      WHERE author = %string{author}
  |sql} record_out] ~author:"kan" in
  query_pool sql conn

F# (dapper):

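open System.Data
open Dapper

// (=>) is assumed to be a small helper, e.g. let inline (=>) k v = k, box v
let get_excerpts_by_authors (conn: IDbConnection) = task {
   let sql = """
      SELECT author, excerpt, source, page
      FROM excerpts
      WHERE author = @author;
   """
   let! data = conn.QueryAsync<Excerpt.t>(sql, dict ["author" => "kan"])
   return data
}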

Serialization

OCaml:
define as serialisable:

type t =
  { author: string
  ; excerpt: string
  ; source: string
  ; page: string option
  }[@@deriving yojson]

and then call the generated function:

let json = Excerpt.to_yojson excerpt in

F#:
mark type as “cli mutable”:

[<CLIMutable>]
type t =
  { author: string
  ; excerpt: string
  ; source: string
  ; page: string option
  }

serialisation happens automatically

async:
OCaml (lwt):
“await” with let* and let+ and then return with Lwt.return

let get_authors req =
    let open Lwt.Syntax in
    let* authors = Db.authors req in
    Lwt.return authors
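
The snippet above only uses let*. To show how let* and let+ differ without needing Lwt installed, here is a small sketch that defines the same two operators for option as a stand-in for promises (Lwt.Syntax provides them for Lwt.t). The find_author function is a hypothetical replacement for Db.authors, and the operators need OCaml 4.08 or later.

```ocaml
(* Binding operators for option, standing in for the ones Lwt.Syntax
   provides for promises. *)
let ( let* ) = Option.bind          (* bind: the body returns a wrapped value *)
let ( let+ ) x f = Option.map f x   (* map: the body returns a plain value *)

(* Hypothetical lookup standing in for Db.authors. *)
let find_author id = if id = 1 then Some "kan" else None

(* With let*, the result has to be wrapped again
   (Some here, Lwt.return in the Lwt version). *)
let author_bind id =
  let* name = find_author id in
  Some (String.uppercase_ascii name)

(* With let+, the wrapping is implicit. *)
let author_map id =
  let+ name = find_author id in
  String.uppercase_ascii name

let () =
  assert (author_bind 1 = author_map 1);
  assert (author_map 2 = None)
```

In Lwt terms, let+ is the shorter choice when the last step is a plain value, and let* is needed when the body itself returns another promise.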

F#:

let get_authors req = task {
    let! authors = Db.authors req
    return authors
}

I wanted to add re-web and morph as dependencies of the existing project. One of the packages I couldn’t add at all; the other had a dependency on OCaml < something, which conflicted.

how can it be added if it is not on opam?

Thanks for those benchmarks and those snippets, interesting comparison. Also I agree that TechEmpower benchmarks would be useful.

In addition to what @yawaramin and @roddy said, you could use a job queue for CPU heavy or long-running stuff where you have many worker processes.

Yes, either redis/equivalent or just your database. It’s not open source yet but I’ve been fine with the latter so far in my project, although my auth method doesn’t require shared state which I think is the big case where redis is useful.

Oh I see. ReWeb and Morph use packages that are not published on opam but are on the npm registry, so it’s not possible to add them to purely opam projects. It’s simple to add them to Esy projects though (of course accounting for version conflicts).

You can add unpublished packages to your opam projects too. Just add a pin-depends field in your project to indicate that the given url/hash should be pinned before installation is attempted.

@avsm this would be great, I tried following the link and this: https://opam.ocaml.org/blog/opam-20-tips/#Pinnings

I tried adding re-web to the list of depends:

depends: [
  "re-web" {dev}
]

and pin-depends:

pin-depends: [
  [ "re-web" "https://github.com/yawaramin/re-web#master" ]
]

it failed with error 3: File format error in 'pin-depends' at line 46, column 11: while expecting package: OpamPackage.of_string
once I added some fake version the parsing worked:

pin-depends: [
  [ "re-web.0" "https://github.com/yawaramin/re-web#master" ]
]

but I got:

[ERROR] Error getting source from https://github.com/yawaramin/re-web#master:
          - Unknown archive type: /private/var/folders/jk/9t2x5zxd1255nbvzn57nw_pc0000gn/T/opam-3104-6d07ae/re-web

I got the same error when trying:

opam pin add re-web https://github.com/yawaramin/re-web#master --dev-repo

Does the package to be pinned need something else?

Hi, in this case it won’t work, because of the reason I mentioned earlier and also because the re-web.opam file is empty. So opam won’t understand it.

I mentioned earlier that ReWeb depends on packages that are published in the npm registry only, not in opam. It’s possible to install some of them with opam pin, but that means you would need to manage their versions and upgrades yourself at the commit level for each project. With the npm-published versions, I am using version bounds so I don’t need to pin to specific commits.

The reason the re-web.opam file is empty is because the project is managed in Esy and it would be a duplication of effort to maintain the info there that Esy requires in the existing package.json. But a re-web.opam file itself is required because otherwise Dune doesn’t work.

It would be nice if unreleased esy projects were compatible with opam.

anyways, from the results above, OCaml had:

Max request time of 1.07s and Stdev 10x that of F#; in some tests it spiked to 2 seconds for some requests. Is this the GC? How can I troubleshoot that?

can someone please direct me to how I can troubleshoot this?
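
One way to start checking whether the GC is involved is to count collections around a piece of work with the standard Gc module. The sketch below is a minimal, stdlib-only illustration: measure and fake_handler are hypothetical names, the handler is an allocation-heavy stand-in rather than the real endpoint, and Sys.time reports CPU time rather than wall-clock time. With an Lwt handler this would only cover the synchronous part of the call, so it is a starting point, not a full profiler.

```ocaml
(* GC activity and CPU time observed while running a function. *)
type gc_delta = {
  minor : int;          (* minor collections *)
  major : int;          (* completed major collection cycles *)
  compactions : int;    (* heap compactions *)
  cpu_seconds : float;  (* processor time, not wall-clock time *)
}

(* Run [f ()] and return its result with the GC counters' increase. *)
let measure f =
  let before = Gc.quick_stat () in
  let t0 = Sys.time () in
  let result = f () in
  let cpu_seconds = Sys.time () -. t0 in
  let after = Gc.quick_stat () in
  ( result,
    { minor = after.Gc.minor_collections - before.Gc.minor_collections;
      major = after.Gc.major_collections - before.Gc.major_collections;
      compactions = after.Gc.compactions - before.Gc.compactions;
      cpu_seconds } )

(* Allocation-heavy stand-in for a request handler. *)
let fake_handler () =
  List.init 200_000 (fun i -> string_of_int i) |> List.length

let () =
  let n, d = measure fake_handler in
  Printf.printf
    "handled %d items: %.4fs cpu, %d minor, %d major, %d compactions\n"
    n d.cpu_seconds d.minor d.major d.compactions
```

If slow requests line up with major collections or compactions, the GC is a plausible suspect; if they don't, the cause is more likely elsewhere. Running the server with OCAMLRUNPARAM='v=0x400' should also print overall GC statistics when the program exits, which gives a coarse picture without code changes.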