Sandmark nightly now monitors the tail latency of sequential and parallel applications, a capability enabled by new features in OCaml 5.
Click to see the Sequential latency benchmark run
Click here to see the Parallel latency benchmark run
Instrumented runtime of the past
Sandmark previously supported monitoring GC latencies using the instrumented runtime that was present in OCaml 4, but this feature was disabled because of breaking changes in Sandmark during the move from OCaml 4 to OCaml 5. The instrumented runtime wrote its events to a file and had a noticeable impact on program speed. As a result, instrumentation had to be enabled with a compile-time flag that linked the application against the instrumented runtime instead of the default runtime. The instrumented runtime was used to generate the graphs in the ICFP paper, Retrofitting Parallelism onto OCaml (Fig 10 and Fig 12). Given its cost, however, it was regarded as a tool only for GC hackers doing performance debugging.
Latency profiling through olly
OCaml 5 supports Runtime Events, a new feature that enables continuous monitoring of production applications. The key differences from the earlier instrumented runtime approach are:
- Instead of a file, the events are now written to a shared in-memory ring. The events may be read out by an external process from this ring.
- Some of the most frequent (and therefore expensive) probes are omitted to keep the cost low. These expensive probes are still available through the instrumented runtime.
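To make the ring-based design concrete, here is a minimal sketch of a consumer written against the `Runtime_events` module from the OCaml 5 standard distribution. For simplicity it reads its own process's ring; the helper name `count_gc_events` and the allocation loop are illustrative assumptions, and the build command in the comment is one plausible way to link the library rather than the only one.

```ocaml
(* Assumed build command (adjust to your setup):
     ocamlfind ocamlopt -package runtime_events -linkpkg demo.ml -o demo *)

(* Hypothetical helper: returns (events read, phase begins, phase ends). *)
let count_gc_events () =
  (* Start writing events to this process's in-memory ring. *)
  Runtime_events.start ();
  (* [None] attaches the cursor to the current process. An external
     monitor would pass [Some (dir, pid)] to attach to another process. *)
  let cursor = Runtime_events.create_cursor None in
  let begins = ref 0 and ends = ref 0 in
  (* Callbacks receive the ring (domain) id, a timestamp, and the phase. *)
  let runtime_begin _ring_id _ts _phase = incr begins in
  let runtime_end _ring_id _ts _phase = incr ends in
  let callbacks =
    Runtime_events.Callbacks.create ~runtime_begin ~runtime_end ()
  in
  (* Allocate and force a major collection so the GC emits some events. *)
  for _ = 1 to 100 do
    ignore (Sys.opaque_identity (Array.make 10_000 0.0))
  done;
  Gc.full_major ();
  (* Drain whatever is currently in the ring, invoking the callbacks. *)
  let n = Runtime_events.read_poll cursor callbacks None in
  Runtime_events.free_cursor cursor;
  (n, !begins, !ends)

let () =
  let n, b, e = count_gc_events () in
  Printf.printf "events read: %d (phase begins: %d, phase ends: %d)\n" n b e
```

The timestamps passed to the callbacks are what a latency tool would pair up (begin and end of the same phase) to measure how long each runtime pause lasted.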
Thanks to this design, every OCaml 5 program can be continuously monitored for performance, not just those compiled with the instrumented runtime. On top of the runtime events feature, we have built olly, an observability tool for OCaml programs. Olly can extract traces of GC events that can be viewed in Perfetto, and it can also produce a short report on GC behaviour, including tail latency profiles.
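As an illustration of what a tail latency profile summarises, the following sketch computes nearest-rank percentiles over a list of pause durations. It is not olly's implementation, which may use a different method; the function name `percentile` and the sample values are made up for the example.

```ocaml
(* Nearest-rank percentile: [p] is in (0, 100], [samples] are pause
   durations in any consistent unit. Illustrative only. *)
let percentile (p : float) (samples : float list) : float =
  match List.sort compare samples with
  | [] -> invalid_arg "percentile: no samples"
  | sorted ->
    let n = List.length sorted in
    (* Rank of the smallest sample that covers p% of the data. *)
    let rank = int_of_float (ceil (p /. 100.0 *. float_of_int n)) in
    List.nth sorted (max 0 (min (n - 1) (rank - 1)))

let () =
  (* Hypothetical GC pause durations in microseconds, with one outlier. *)
  let pauses = [ 120.; 95.; 110.; 4000.; 105.; 98.; 101.; 130.; 99.; 102. ] in
  List.iter
    (fun p -> Printf.printf "p%g: %.0f us\n" p (percentile p pauses))
    [ 50.; 99.; 100. ]
```

With these made-up samples the median stays near 100 us while the 99th percentile is dominated by the single 4000 us pause, which is why tail percentiles, rather than averages, are the figures of interest for latency.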
The Sandmark team has now replaced the old latency profiling feature, which was built on the OCaml 4 instrumented runtime, with profiles generated by olly. (See Sandmark PR here). As a result, the OCaml compiler is now continuously monitored not only for speed and memory usage, but also for latency.
Call for action
If you are interested in profiling and analysing the performance of the development branch of the OCaml compiler, please submit your branch through Sandmark Nightly Config.