Status of profiling in OCaml?

I’m trying to profile an OCaml program and not having much luck. gprof support is no longer present, and it looks like we’re supposed to use ocamlcp and ocamloptp, but those don’t support ppx, which I use in this project.

What is the suggested way to profile an OCaml program?

See some references here:

Another interesting idea was the Statistical Memory Profiling PR, but sadly it’s completely out of sync and abandoned.

Tom Kelly wrote up some of his experiences profiling OCaml code a little while ago; it’s probably a good current reference:

If you’ve got any specific questions, feel free to drop them in here. We’ve been doing a lot of profiling for the performance work on multicore.


You’re saying these tools don’t support preprocessing with PPX, right? If so, you could preprocess before invoking the ocaml tools. From reverse-engineering what dune does, I believe dune always preprocesses before invoking ocaml. So that’s one way to do it.

If you’re using ocamlfind, it isn’t very difficult to figure out the preprocessing commands, though I can understand why it would be a bit of a pain. So I wrote a little tool, ocamlfind2 (which merely trampolines over to ocamlfind for everything else), that supports a preprocess command. It takes the same arguments as ocamlfind ocamlc (at least, those that concern preprocessing) and runs the preprocessor. I wrote it so I could debug PPX rewriters and camlp5, but hey, it might be useful to you.

An example:

ocamlfind2 preprocess -package camlp5,compiler-libs.common,ounit2,fmt,pcre,rresult,compiler-libs.common,yojson,sexplib,ounit2,ppx_import,,ppx_deriving.eq,ppx_deriving.ord,ppx_deriving.enum,ppx_deriving.iter,,ppx_deriving.fold,ppx_deriving.make,ppx_deriving_yojson,ppx_here >

[My apologies for the lack of any documentation; the tool really is just something I wrote for my own needs, though I do plan to release it at some point, b/c this sort of thing is useful for sure.]

Nothing could be further from the truth. @jhjourdan and @stedolan have been working like crazy to improve this work and merge it into mainline OCaml. It will be part of release 4.11, which is already in alpha. Please don’t spread misinformation.
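For anyone who wants to try the statistical memory profiler once it is available, here is a minimal sketch of driving it through the standard library's Gc.Memprof module. The sampling rate, the call stack depth, the `samples` table and the toy workload are illustrative assumptions, not something taken from this thread. The exact type of `start` has changed between compiler releases, so check the Gc documentation for your version.

```ocaml
(* Sketch: count sampled allocations per call stack with Gc.Memprof.
   Assumes a compiler whose standard library provides Gc.Memprof. *)

(* Hypothetical table: call stack (as text) -> number of samples seen. *)
let samples : (string, int) Hashtbl.t = Hashtbl.create 64

(* Callback run for each sampled allocation. Returning None tells the
   runtime not to track the block any further. *)
let record (info : Gc.Memprof.allocation) : unit option =
  let key = Printexc.raw_backtrace_to_string info.Gc.Memprof.callstack in
  let prev = Option.value (Hashtbl.find_opt samples key) ~default:0 in
  Hashtbl.replace samples key (prev + info.Gc.Memprof.n_samples);
  None

let tracker : (unit, unit) Gc.Memprof.tracker =
  { Gc.Memprof.null_tracker with alloc_minor = record; alloc_major = record }

let total_samples () = Hashtbl.fold (fun _ n acc -> acc + n) samples 0

let () =
  (* [ignore] keeps this compiling whether [start] returns unit or a handle. *)
  ignore (Gc.Memprof.start ~sampling_rate:1e-3 ~callstack_size:20 tracker);
  (* Toy workload standing in for the real application. *)
  let junk = List.init 100_000 string_of_int in
  ignore (Sys.opaque_identity junk);
  Gc.Memprof.stop ();
  Hashtbl.iter
    (fun stack n -> Printf.printf "%d samples at:\n%s\n" n stack)
    samples
```

Build with -g so the recorded call stacks carry source locations; without debug information the stack strings may be empty.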


Oh great!

My situation has changed a bit, in that I found the big obvious performance issue. But now I’m trying to figure out a memory leak that is impacting performance over time.

So the specific situation I’m in is that I have an application I’ve written as part of a passion project, I’m doing some load testing on it, and I’m seeing that under sustained load there is a slow build-up of memory and reqs/s goes down. After running it through spacetime I see a slow build-up of live blocks associated with a specific allocation point. The stack that corresponds to the most allocations is a bit hard to grok, as a lot of it goes through framework-esque functions. So I’m still trying to figure out what exactly to do with the memory profiling data.

Thanks, I’ll look into your tool. Mostly I was trying to figure out some performance technique and felt a bit of frustration that the suggested tooling didn’t Just Work with what seemed like a pretty common use case.
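One cheap way to confirm this kind of slow build-up independently of spacetime is to log the live heap size after a full major collection at regular points in the workload. The sketch below uses only the standard Gc module; `watch_growth`, `leaky_step` and the `retained` list are hypothetical names made up for illustration, with `leaky_step` standing in for one round of load against the real application.

```ocaml
(* Sketch: detect heap growth by sampling live words between rounds. *)

(* Live words after a full major GC, so only reachable data is counted. *)
let live_words () =
  Gc.full_major ();
  (Gc.stat ()).Gc.live_words

(* Run [step] [rounds] times and return the live-word count after each. *)
let watch_growth ~rounds step =
  let samples = ref [] in
  for i = 0 to rounds - 1 do
    step i;
    samples := live_words () :: !samples
  done;
  List.rev !samples

(* Deliberately leaky stand-in for the application: every call keeps
   another 4 KiB buffer reachable from a global list. *)
let retained : bytes list ref = ref []
let leaky_step _ = retained := Bytes.create 4096 :: !retained

let () =
  watch_growth ~rounds:5 leaky_step
  |> List.iteri (fun i w -> Printf.printf "round %d: %d live words\n" i w)
```

If the numbers keep climbing round after round even though the workload is steady, something is staying reachable. The allocation point reported by the profiler then tells you what is being retained, and the remaining question is which long-lived structure is holding on to it.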

Well, ocamlfind2 preprocess is merely a way to invoke PPX rewriters – that’s all. And TBH, if you can use dune (I don’t, but lots of people do), that’ll do it for you automatically.

Last time I had to profile an OCaml program I used LexiFi’s landmark library which succeeded in identifying my bottlenecks.

AFAIR it was very easy to set up and gather data: something like adding a ppx to your build and setting an environment variable when running your executable. At the time the data viewer story was a bit suboptimal, but that may have been fixed in the meantime (see

Also, if you are on Linux and using native code, you can use perf to get a call graph profile. At one point it was also possible to use macOS’s native profiler (Shark), but AFAIR I didn’t manage to get such profiles with the new “Instruments” tooling, which is the reason why I ended up using landmark.