Profiling with landmarks and Iter.t

Hi! I’ve been doing profiling with landmarks and it has worked fairly well, but my current codebase uses a particular monadic style that makes profiling quite difficult, and I wanted to gather ideas or information on how to make this easier.

Basically, most of the execution is spent in an 'a Iter.t monad (i.e. ('a -> unit) -> unit), which we use to model symbolic execution and non-determinism.

Furthermore, we make good use of wrapper functions, e.g. this simple wrapper for an effect for global state:

let with_crate (crate : t) f =
  let open Effect.Deep in
  try f () with effect Get_crate, k -> continue k crate

The combination of these two things makes profiling quite impractical:

  1. Because of the wrapper functions, my entire profile is a dozen calls deep, making it hard to read (this is a minor problem). I was wondering if there are alternatives to these wrapper functions, or an option to omit these frames when profiling.

  2. Because our monad relies very heavily on small functions, and we make extensive use of let-binds to compose computations, the main monadic functions do not show up in the profile at all. For instance, I have a function exec_stmt : stmt -> unit Iter.t that, despite being called hundreds of thousands of times, does not appear.
    I suspect this is because of the let-binds, or possibly because it is immediately wrapped in a function too:

      and exec_stmt stmt : unit t =
        let@ () = with_loc ~loc:stmt.span in
        match stmt.kind with
        | Nop -> ok ()
        | Assign (({ ty; _ } as place), rval) ->
            let* ptr = resolve_place place in
            let* v = eval_rvalue rval in
            State.store ptr ty v
        | ... 
    

    This is a small snippet of that function. Note we define let@ as ( @@ ), so this is effectively doing:

    with_loc ~loc:stmt.span @@ fun () ->
      match stmt.kind with | ...
    

    And similarly, let* is basically Iter.flat_map, so that’s just an additional closure.
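
To illustrate what I suspect is happening, here is a minimal self-contained sketch. It does not use our real code: the type 'a t below is a stand-in for Iter.t, and with_loc, step, note and events are simplified, hypothetical helpers that only record the order of events. If my understanding is right, calling exec_stmt merely builds a closure and returns, and the actual work only runs later, when that closure is given a continuation.

```ocaml
(* Stand-in for Iter.t: a computation feeds its results to a continuation. *)
type 'a t = ('a -> unit) -> unit

let return (x : 'a) : 'a t = fun k -> k x
let flat_map (f : 'a -> 'b t) (m : 'a t) : 'b t = fun k -> m (fun x -> f x k)
let ( let* ) m f = flat_map f m
let ( let@ ) = ( @@ )

(* Event log, newest first; [events ()] returns it in chronological order. *)
let log : string list ref = ref []
let note (s : string) : unit = log := s :: !log
let events () : string list = List.rev !log

(* Hypothetical wrapper shaped like with_loc: it logs when its body is
   entered and when the body has returned the (not yet run) computation. *)
let with_loc ~(loc : string) (f : unit -> 'a t) : 'a t =
  note ("enter with_loc " ^ loc);
  let m = f () in
  note ("exit with_loc " ^ loc);
  m

(* Hypothetical monadic step: the logging happens only once it is run. *)
let step (name : string) : int t =
  fun k ->
    note ("run " ^ name);
    k 1

(* Same shape as the real exec_stmt: a let@ wrapper, then let* binds. *)
let exec_stmt () : unit t =
  let@ () = with_loc ~loc:"s1" in
  let* a = step "resolve_place" in
  let* b = step "eval_rvalue" in
  note (Printf.sprintf "store %d" (a + b));
  return ()
```

In this sketch, after let m = exec_stmt () the log only contains the with_loc enter and exit events. The "run ..." and "store ..." events appear only once m is applied to a continuation, i.e. after the call to exec_stmt has already returned. A landmark placed around the body of exec_stmt would therefore cover only the closure construction, which might explain why the function is missing from the profile.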

If you have any recommendations on how profiling can be made easier with our current implementation, please let me know! I am very happy to add ppx annotations or make small modifications that would make profiling more effective (as long as performance isn’t negatively affected).
I care a lot about performance and this project has been quite hard to benchmark :P

Thanks!