[ANN] perf demangling of OCaml symbols (& a short introduction to perf)

As a project sponsored by the OCaml Software Foundation, I’ve worked on demangling OCaml symbols in perf. Some screenshots are below. The work is currently being upstreamed. In the meantime, it can be used as follows:

git clone --depth=1 https://github.com/copy/linux.git
# or:
# wget https://github.com/copy/linux/archive/master.tar.gz && tar xfv master.tar.gz
cd linux/tools/perf
make
alias perf=$PWD/perf
# or copy perf to somewhere in your PATH

Your distribution’s version of perf will also work for the examples below, but will have less readable symbols :slight_smile:

Short introduction to perf

Perf is a Linux-only sampling profiler (and more), which can be used to analyse the performance profile of OCaml and other executables. When compiling with ocamlopt, add -g to include debug information in the executable. dune does this automatically, even in the release profile. To start a program and record its profile:

perf record --call-graph dwarf program.exe

Or record a running program:

perf record --call-graph dwarf -p `pidof program.exe`

Then, view a profile using:

perf report # top-down
perf report --no-children # bottom-up

Within the report view, the following keybindings are useful:

  • +: open/close one callchain level
  • e: open/close entire callchain
  • t: toggle between current thread and all threads (e.g., only dune, ocamlopt, etc.)

Or generate a flamegraph:

git clone https://github.com/brendangregg/FlameGraph
cd FlameGraph
perf script -i path/to/perf.data | ./stackcollapse-perf.pl | ./flamegraph.pl > perf-flamegraph.svg

You may need to run the following command to allow recording by non-root users (more info):

echo 0 | sudo tee /proc/sys/kernel/perf_event_paranoid

Sources

Before:

After:

Bottom-up:

Flamegraph (cropped):

Thanks!

Out of curiosity, I went to find the patch proposed upstream. I located the following:

https://lore.kernel.org/linux-perf-users/20210203211537.b25ytjb6dq5jfbwx@nyu/T/#u

(It looks like the patchset has not received comments yet. Is it something that we can/should help with?)

I looked at the patch and I have two comments, the first one minor:

  • There is no string-size check before the sym[caml_prefix_len] access in ocaml_is_mangled, so with the input “caml” we are reading the terminating null byte here. This is safe and correct, but I found it a bit surprising. (Maybe this sentinel-like usage is familiar to kernel programmers, or just people who actually write C code for a living.)

  • The final part of the demangling removes the _[0-9]* suffix of the symbol, which corresponds to a unique symbol identifier generated by the OCaml compiler backend.

I am not sure that removing this identifier is the right thing to do; indeed there are cases where two symbols only differ by their identifier, consider the following pattern:

let foo input =
  let rec loop =
    ...
  in loop input

let bar input =
  let rec loop =
   ...
  in loop input

Currently the only way I know to distinguish the two loop functions when they show up as a symbol is to call (ocamlc -dlambda) or (ocamlopt -dcmm) on the module, look at the compiler-produced identifiers (say camlMod__loop_2245 and camlMod__loop_2257), and remember which is which.

In some cases we can recover the information from more context: maybe the symbol comes with source debug location, or it shows as part of a backtrace where other entries clarify which function it is (note: in the example above both loop functions are called in tail position from their outer function, which will not show in the stack). But I wonder if there are still some cases where we could want the information, which might suggest demangling camlMod__loop_2245 into, say, Mod.loop(2245) rather than just Mod.loop as the current patch does.

I wonder if other people with more experience looking at perf output have an opinion here (maybe one of @lpw25, @stedolan, @mshinwell?).

I haven’t had time to investigate yet, but I am also unsure about removing the integer suffixes. I often look at code in perf where there are multiple functions that have the same name up to the suffix due to (cross-module) inlining. For the utility of these low-level tools, I think that it is necessary for the demangling to preserve uniqueness of names.

I think one thing that is important to understand is that OCaml performs very little mangling. The only part which, to me, can be defined as mangling is that symbols are built by appending a caml prefix, the name of the compilation unit (basically the module being defined), then two underscores (__), then some name that needs to be unique. This is also extended to packed modules, so that the compilation unit for the file foo.ml that will be packed inside Bar is Bar__Foo.
(Because of this, a number of people use the double underscore in module names to trick programmers and tools into believing that one module is actually a submodule of another (the most common example is dune’s wrapped libraries). I don’t consider this mangling, as it’s not done by the compiler.)

However, within a given compilation unit, functions and other statically allocated values will get names that can be completely arbitrary, although a decent amount of work is done to make it possible to find out which part of the code it comes from. In particular, it’s perfectly possible to build examples where a file foo.ml defines a function bar, and the resulting object contains a symbol named camlFoo__bar_nn (and no other one with a similar name) but the symbol is not the one corresponding to Foo.bar.
Here is such an example:

(* functor.ml *)
module type S = sig val c : int end
module F (M : S) : sig val f : int -> int end = struct
  let bar x = x + M.c
  let f = bar
end

(* foo.ml *)
module M = struct let c = 0 end
include Functor.F (M)
let add x y = x + y
let bar = add

If you compile this program with flambda, and look at the generated code (-dcmm is a good way to look at it) then you should see that the output contains a function camlFoo__bar_nn which takes a single argument and returns it, while you would expect Foo.bar to take two arguments and return their sum.

I still think it’s a very good idea to improve the tools, and this proposal contains some nice ideas (I particularly like the fact that the locations encoded into the names of anonymous functions are printed in a legible way), but I’d caution against going too far and making too many assumptions.

You could contribute by testing or reviewing the code and responding on the mailing list with a Tested-by or Reviewed-by tag. I’m not sure how much weight is given to first-time reviewers, but it certainly wouldn’t harm. See “Submitting patches: the essential guide to getting your code into the kernel” in the Linux kernel documentation.

The existing Rust demangler makes similar assumptions, so I opted to write the OCaml demangler in the same style. I’m not sure about the strncmp case, but sym[i] == '_' && sym[i + 1] == '_' is certainly common in C codebases I’ve worked on.
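For readers less used to this idiom, here is a minimal sketch of the two sentinel-style checks discussed here. It is not the code from the patch: the names looks_like_ocaml_symbol and double_underscore_at are made up for illustration, and the uppercase test after the prefix is an assumption about what a compilation unit name looks like.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative sketch only; names are hypothetical, not taken from the patch. */
#define CAML_PREFIX "caml"
#define CAML_PREFIX_LEN (sizeof(CAML_PREFIX) - 1)

/*
 * True if sym starts with "caml" followed by an uppercase ASCII letter
 * (assumption: the compilation unit name is capitalized).
 */
static bool looks_like_ocaml_symbol(const char *sym)
{
    if (strncmp(sym, CAML_PREFIX, CAML_PREFIX_LEN) != 0)
        return false;
    /*
     * strncmp matched four non-NUL bytes, so index 4 is still inside the
     * string. In the worst case (sym == "caml") it is the terminating NUL,
     * which simply fails the range test below.
     */
    char c = sym[CAML_PREFIX_LEN];
    return c >= 'A' && c <= 'Z';
}

/*
 * True if "__" starts at index i. Requires i <= strlen(sym).
 * If sym[i] is the terminator, && short-circuits and sym[i + 1] is never read.
 */
static bool double_underscore_at(const char *sym, size_t i)
{
    return sym[i] == '_' && sym[i + 1] == '_';
}
```

In both functions the terminating NUL acts as the bounds check: a string that is too short fails the comparison before any byte past the terminator is touched.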

This is a good point. I was under the impression that such cases were mostly unambiguous, but it seems to be more common than I thought. I’ll post a v2 of the patch on the mailing list that preserves the trailing identifier id.

I’m not sure about the (2245) pattern, rather than just leaving _2245 in place. It makes it harder to correlate with cmm or disassembly output, e.g. when using grep. It also breaks the property that the demangled symbol is strictly smaller than the mangled symbol (or makes it harder to verify this property).

Replacing __ by . is indeed a bit of a heuristic, but I think it results in more readable names in the end. Even in your example, the include Functor.F (M) line causes bar from functor.ml to be visible in the scope of the module Foo (and is later shadowed). Foo.bar is not a terrible name, especially in the presence of (cross-module) inlining. An interesting question is whether the compiler could generate symbol names in such situations, e.g. by appending [inlined from …], but that is unrelated to these changes.

To clarify, the patch currently modifies symbols generated by the OCaml compiler as follows:

  1. Remove the caml prefix
  2. Replace __ by .
  3. Unescape $xx sequences, e.g. $2b becomes +
  4. Remove trailing _\d+ (will be removed in the next version)
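The first three steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the upstreamed code: demangle_sketch and hex_value are hypothetical names, the result is heap-allocated for simplicity, and step 4 is left out because the next version of the patch keeps the trailing identifier.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Value of a hexadecimal digit, or -1 if c is not one. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical sketch of steps 1-3. Returns a malloc'd string that the
 * caller must free, or NULL if sym does not look like an OCaml symbol
 * (or if allocation fails).
 */
static char *demangle_sketch(const char *sym)
{
    /* Step 1: require and skip the "caml" prefix. */
    if (strncmp(sym, "caml", 4) != 0 || !isupper((unsigned char)sym[4]))
        return NULL;

    /* Every rewrite below shrinks the text, so strlen(sym) + 1 is enough. */
    char *out = malloc(strlen(sym) + 1);
    if (!out)
        return NULL;

    size_t i = 4, j = 0;
    while (sym[i] != '\0') {
        if (sym[i] == '_' && sym[i + 1] == '_') {
            /* Step 2: "__" becomes ".". */
            out[j++] = '.';
            i += 2;
        } else if (sym[i] == '$' && hex_value(sym[i + 1]) >= 0
                   && hex_value(sym[i + 2]) >= 0) {
            /* Step 3: "$xx" becomes the byte with hex value xx. */
            out[j++] = (char)(hex_value(sym[i + 1]) * 16 + hex_value(sym[i + 2]));
            i += 3;
        } else {
            out[j++] = sym[i++];
        }
    }
    out[j] = '\0';
    return out;
}
```

Each rewrite maps two or three input bytes to one output byte and the prefix is dropped, so the result is always strictly shorter than the input, which is the property mentioned above.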

I think you’re missing the point. Foo.bar is a terrible name because the Foo module has a bar field, so naming anything else Foo.bar is misleading. Besides, you may not have noticed but bar is explicitly not exported in the signature of Functor.F, so the include doesn’t bring bar into the scope of Foo. Finally, this example is not completely hypothetical: if you replace bar with compare, this is a relatively common occurrence.

There has been some work recently to improve the printing of backtraces with better contextual information, I think this could be reused for generating better symbol names too. Maybe I’ll give it a try some day.

As @gasche and @jjb have pointed out, there are even simpler cases where Foo.bar does not refer to the toplevel bar in Foo. I think another opinion on this would be helpful: should OCaml symbol demangling replace __ by .? It produces correct names in the simplest case of a toplevel function, but may produce misleading names for nested functions and inner modules. For example, all three foo functions below will be named Test.foo_xxx (with different xxx suffixes, e.g. Test.foo_1234); without the replacement, their names would be Test__foo_xxx.

let foo x y = x + y

module Bar = struct
  let foo x y = x + y
end

let bar () =
  let foo x y = x + y in
  ()
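To make the collision concrete, here is a small sketch of what dropping the trailing _\d+ does to such names. The helper names and the identifiers used in the usage note (1234 and 1240) are hypothetical; the point is only that names which differ solely by their identifier become indistinguishable once it is removed.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Length of name once a trailing "_<digits>" suffix (at least one digit)
 * is dropped; the full length if there is no such suffix.
 */
static size_t length_without_id(const char *name)
{
    size_t len = strlen(name);
    size_t i = len;

    /* Walk backwards over the trailing digits. */
    while (i > 0 && isdigit((unsigned char)name[i - 1]))
        i--;

    /* Strip only if at least one digit was seen and '_' precedes them. */
    if (i < len && i > 0 && name[i - 1] == '_')
        return i - 1;
    return len;
}

/* True if a and b are equal once their trailing identifiers are dropped. */
static bool same_after_stripping(const char *a, const char *b)
{
    size_t la = length_without_id(a);
    size_t lb = length_without_id(b);

    return la == lb && strncmp(a, b, la) == 0;
}
```

With this sketch, same_after_stripping("Test.foo_1234", "Test.foo_1240") is true even though the two full names differ, which is the ambiguity that keeping the identifier avoids.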

Quick update: The patches have successfully been upstreamed. The original patch (which removes the identifier id) has already been released in 5.12, while the second patch (which preserves the identifier id) is in current master and should make it into the next release.

I also see that Stdlib’s inner modules are not capitalized, I guess that’s the __ naming trick but it shows that one can’t be too sure if a name after two underscores corresponds to a function or a module (is it a good heuristic to say anything that looks like __this__ is likely a module named This?), and your example shows that scope info is destroyed in symbol generation… I wonder why the symbol naming process has been made one-directional like that?

I also see that Stdlib’s inner modules are not capitalized, I guess that’s the __ naming trick but it shows that one can’t be too sure if a name after two underscores corresponds to a function or a module

This is due to a hack in the build process specific to the stdlib. It has already been addressed upstream (ocaml/ocaml pull request #10169, “Change the Standard Library prefixing to match Dune's”, by dra27). As far as I know, module names generated by ocamlopt (and dune, ocamlbuild, …) have never been lower-case.

(is it a good heuristic to say anything that looks like __this__ is likely a module named This?)

Yes, that is the convention used by ocamlopt: Symbol bar in foo.ml is named camlFoo__bar. dune and the stdlib build system take this idea further and rename foo.ml in a library to lib__foo.ml, so that the symbol becomes camlLib__Foo__bar.

and your example shows that scope info is destroyed in symbol generation… I wonder why the symbol naming process has been made one-directional like that?

I agree that it would be nice to improve the current naming scheme :slight_smile: . The main purpose of this change was to make perf aware of the current mangling scheme. It should be compatible with further improvements in the upstream generation of names, such as including submodules in the name.

Hi @copy, I just saw this topic and was wondering if the instructions from your initial post are still up-to-date.

If I install perf normally, will it include your modifications?

If I install perf normally, will it include your modifications?

Yes, an up-to-date distribution will very likely include perf with my changes.
