Flamegraphs and recursive functions

Continuing the discussion from Opam list very slow on opam 2.5:

Thanks for the reminder, although I miss the recursive collapse feature of the original FlameGraphs (particularly useful for List.map, Set.add, etc.). Without it, stacks that differ only in how many nested calls to Set.add they contain show up as separate columns.
As a proof of concept here is how to collapse recursive calls using perf’s flamegraph script:

--- flamegraph.py       2026-06-09 23:28:54.958156383 +0100
+++ /usr/lib/perf-core/scripts/python/flamegraph.py     2026-06-09 23:29:41.012998388 +0100
@@ -111,7 +111,8 @@
         node = self.find_or_create_node(self.stack, comm, libtype)
 
         if "callchain" in event:
-            for entry in reversed(event["callchain"]):
+            import itertools
+            for entry in reversed([k for k, _ in itertools.groupby(event["callchain"])]):
                 name = entry.get("sym", {}).get("name", "[unknown]")
                 libtype = self.get_libtype_from_dso(entry.get("dso"))
                 node = self.find_or_create_node(node, name, libtype)
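
The same idea can be tried outside of perf. The sketch below is only an illustration under stated assumptions: the helper name `collapse_direct_recursion` and the sample data are made up, and the callchain is assumed to be a list of dicts shaped like the entries the script reads above (an optional `"sym"` dict with a `"name"`). Unlike the patch, it groups by symbol name rather than by whole-entry equality, so two frames of the same function still merge if the entries also carry per-frame fields (an address, for instance) that differ.

```python
import itertools


def collapse_direct_recursion(callchain):
    """Merge runs of adjacent frames that resolve to the same symbol name.

    Hypothetical helper: keeps the first entry of each run and drops the rest.
    """
    def symbol_name(entry):
        # Same lookup as the flamegraph script uses for naming a frame.
        return entry.get("sym", {}).get("name", "[unknown]")

    # groupby only merges *adjacent* equal keys, so A -> B -> A is left alone.
    return [next(group) for _, group in itertools.groupby(callchain, key=symbol_name)]


# Made-up sample, leaf frame first. The "ip" values are illustrative only.
sample = [
    {"ip": 0x1010, "sym": {"name": "Set.add"}},
    {"ip": 0x1024, "sym": {"name": "Set.add"}},
    {"ip": 0x1024, "sym": {"name": "Set.add"}},
    {"ip": 0x2000, "sym": {"name": "List.map"}},
    {"ip": 0x3000},  # no symbol information
    {"ip": 0x4000, "sym": {"name": "main"}},
]

collapsed = collapse_direct_recursion(sample)
names = [e.get("sym", {}).get("name", "[unknown]") for e in collapsed]
print(names)  # ['Set.add', 'List.map', '[unknown]', 'main']
```

One caveat of keying on the name: adjacent frames without symbol information all map to `[unknown]` and get merged too, which may or may not be what you want.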

This still needs a CLI flag and some documentation; then a patch could be submitted upstream.
Before: [flame graph image]

After: [flame graph image]

Yes sure, there are a lot of shortcomings to using this directly.
For instance, it does not work well if you spawn multiple OCaml processes from another OCaml program (if you spawn the same process many times, they’ll all appear as distinct entries in the output, whereas in most cases you’d want them merged). It also does not handle very large traces well. Fortunately, I realized that hotspot solves almost all of these issues (including collapsing recursive calls, IIRC), so my advice would be to simply open the trace in hotspot.

I use Firefox Profiler when profiling Goblint, which is extremely recursive. Firefox Profiler doesn’t automatically collapse any recursion but it can be done on a per-function basis with lots of control. Furthermore, it has separate commands for collapsing any recursion of a given function, or only direct recursion. The latter is useful for the usual suspects like Set.fold to avoid mixing up distinct recursive calls in the same callstack.
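
To make the difference between the two commands concrete, here is a small sketch. It is not Firefox Profiler's implementation, just one plausible reading of the two behaviours; the function names and the sample stack are invented for illustration, and stacks are plain lists of function names, root first.

```python
def collapse_direct_only(stack, func):
    """Merge runs of immediately nested calls to func: f -> f -> f becomes f."""
    out = []
    for frame in stack:
        if frame == func and out and out[-1] == func:
            continue  # direct self-call: fold it into the frame above
        out.append(frame)
    return out


def collapse_any(stack, func):
    """Keep the outermost func frame and resume after the innermost one,
    dropping everything in between (including other functions)."""
    if func not in stack:
        return list(stack)
    first = stack.index(func)
    last = len(stack) - 1 - stack[::-1].index(func)
    return stack[:first + 1] + stack[last + 1:]


# Two distinct folds: the outer one calls visit, which starts another fold.
stack = ["main", "Set.fold", "Set.fold", "visit", "Set.fold", "Set.fold", "compare"]

direct = collapse_direct_only(stack, "Set.fold")
anywhere = collapse_any(stack, "Set.fold")
print(direct)    # ['main', 'Set.fold', 'visit', 'Set.fold', 'compare']
print(anywhere)  # ['main', 'Set.fold', 'compare']
```

With the direct-only variant, the two folds stay separate and `visit` remains visible between them; collapsing any recursion merges them into a single frame.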

Recording with perf record --call-graph=dwarf is what produces those [unknown] rows in the flame graph. Try re-running with an opam binary built with frame pointers.