I was surprised to observe, with the following snippet, that the calls to MA.g () and MB.g () go through an indirection:
module type S = sig
  val g : unit -> unit
end

module A : S = struct
  let g () = Format.eprintf "A.g\n"
end

module B : S = struct
  let g () = Format.eprintf "B.g\n"
end

module Make (X : S) : S = struct
  include X
end

module MA = Make (A)
module MB = Make (B)

(* The print_int calls are markers that make the assembly code easier to read *)
let () =
  print_int 3;
  A.g ();
  print_int 4;
  MA.g ();
  print_int 5;
  MA.g ();
  print_int 6
Compiling with ocamlopt -S test.ml lets us inspect the generated assembly code (OCaml 4.14.0):
Using the markers above, we see that the code generated for A.g () is at the labels .L108 and .L109, and that it calls the function eprintf directly.
Again using the markers, the code generated for MA.g () is at the labels .L111 and .L112. This time there is an indirection: the call goes through the address held in the register %rdi.
Is there a way to remove this indirection?
I would assume flambda does this, but for my use case I cannot rely on flambda.
I did not find any ppx that would do it.
Using -O3 does not help either.
I am also wondering why the OCaml compiler does not currently perform this inlining.
For technical reasons, the default inliner cannot inline functions that contain functions. That covers most functors. So the modules MA and MB in your example are more or less (typed) black boxes for the compiler.
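As a rough illustration of this restriction, the same situation can be reproduced without functors, since a functor is compiled as a function that returns a block of values. The names below (add_one, make_adder) are hypothetical, and whether a given call is actually inlined depends on the compiler version and flags, so the comments describe what the explanation above leads one to expect rather than a guarantee.

```ocaml
(* A small function whose body contains no other function: the kind of
   function the default (non-flambda) inliner can consider. *)
let add_one x = x + 1

(* A function whose body builds and returns another function. Per the
   restriction described above, the default inliner is expected to leave
   it alone, much like a functor whose body defines functions. *)
let make_adder n =
  let doubled = n * 2 in
  fun x -> x + doubled

let () =
  let add_six = make_adder 3 in
  Printf.printf "%d %d\n" (add_one 1) (add_six 4)
```

Compiling this with ocamlopt -S and comparing the two call sites is one way to check how a particular compiler treats each case.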
It would be technically feasible to store enough information about functions (and functors) that, even without inlining them, the compiler could propagate an approximation of their result. In your case that would allow the calls to MA.g and MB.g to be inlined, or at least translated to direct calls. But that’s a non-trivial amount of work and would likely increase the size of cmx files by a large factor.
As you guessed, flambda can indeed remove the indirection (it does not have the limitation on functions containing functions). I don’t know about ppxes, but -O3 is a flambda-specific option so it would not help if you’re not using flambda.
The trick is to use flambda. If you can’t, the only workaround I can see would be to try to use the local function optimisation ([@local] attribute), but I don’t think I’ve ever heard of anyone using it for functors and it comes with huge restrictions on the way you organize your code.
I just noticed that this feature (propagating an approximation of the result of a function or functor that is not inlined) has actually been there for a while already. The reason it doesn’t work on your example is that your functor doesn’t define any functions itself, so inlining is the only way to recover the correct function to call. But for usual functors like Map.Make or Set.Make, even if the functor is not inlined, the functions themselves can be called without indirection.
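For instance, with a standard-library functor (a minimal sketch; IntMap and m are names chosen for illustration), the functions are defined inside the body of Map.Make. According to the explanation above, the calls below should therefore compile to direct calls even without flambda. Inspecting the output of ocamlopt -S is the way to confirm this for a given compiler version.

```ocaml
(* Map.Make defines add and find in its own body, so the compiler can
   know which code these calls target without inlining the functor. *)
module IntMap = Map.Make (Int)

let m = IntMap.add 1 "one" IntMap.empty

let () = print_endline (IntMap.find 1 m)
```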
In your case, if you want to benefit from this you need to put as much logic as possible in the body of the functor. For example:
module type S = sig
  val g : unit -> unit
end

module type Arg = sig
  val msg : string
end

module A : Arg = struct
  let msg = "A.g\n"
end

module B : Arg = struct
  let msg = "B.g\n"
end

module Make (X : Arg) : S = struct
  let g () = Format.eprintf "%s" X.msg
end

module MA = Make (A)
module MB = Make (B)

let () =
  print_int 4;
  MA.g (); (* This should be a direct call or inlined *)
  print_int 5;
  MB.g (); (* This should be a direct call or inlined *)
  print_int 6
If your actual code can be rewritten this way, then you should be able to get rid of the indirections. If not, flambda is probably your only solution.
I can point you at the relevant parts of the code, but I don’t think there is any documentation of that anywhere. As far as I could tell this code was already present in the first version of ocamlopt, so there isn’t even a PR I can link to.
The value approximations are used in middle_end/closure/closure.ml to perform various optimizations, including inlining, direct calls, and more.
In the Value_closure case, the second field is the approximation of the function’s result. It is computed in close_functions and used when compiling Lapply terms.
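To make the idea concrete, here is a toy model of how a result approximation lets a call be resolved without inlining. It is only a sketch: the types approx and call, the functions field, result_of and classify, and the labels "Make" and "Make_g" are all hypothetical and far simpler than the real definitions in the compiler.

```ocaml
(* Toy model only: NOT the compiler's actual data structures. *)
type approx =
  | Closure of string * approx  (* known code label, approximation of the result *)
  | Block of approx array       (* a block (e.g. a module) with approximated fields *)
  | Unknown                     (* nothing is known about the value *)

type call = Direct of string | Indirect

(* Approximation of field [i] of a value. *)
let field a i =
  match a with
  | Block fields when i >= 0 && i < Array.length fields -> fields.(i)
  | _ -> Unknown

(* Approximation of the result of applying a value to an argument. *)
let result_of = function
  | Closure (_, result) -> result
  | Block _ | Unknown -> Unknown

(* A call can be direct only if the callee's code label is known. *)
let classify = function
  | Closure (label, _) -> Direct label
  | Block _ | Unknown -> Indirect

(* A functor whose body defines g itself: the result is a block whose
   first field is a known closure. *)
let make_with_body = Closure ("Make", Block [| Closure ("Make_g", Unknown) |])

(* A functor that only re-exports its argument: without inlining, nothing
   is known about the field. *)
let make_with_include = Closure ("Make", Block [| Unknown |])

let describe = function
  | Direct label -> "direct call to " ^ label
  | Indirect -> "indirect call"

let () =
  print_endline (describe (classify (field (result_of make_with_body) 0)));
  print_endline (describe (classify (field (result_of make_with_include) 0)))
```

In this model, the first functor corresponds to the rewritten Make that defines g in its body, and the second to the original Make that only does include X.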