I recently ran into a situation with packed GADTs where technically TRMC is implementable, but I’m not sure if it’s doable with the features OCaml currently provides. Here’s a simplified example:
module List_with_length = struct
type ('a, 'length) t =
| [] : ('a, unit) t
| (::) : 'a * ('a, 'length) t -> ('a, unit * 'length) t
type 'a packed = T : ('a, _) t -> 'a packed [@@unboxed]
(* [of_list] can return any length, so we need to return a packed version.
This seems to mean that I cannot use TMRC.
*)
let rec of_list (list : 'a list) =
match list with
| [] -> T []
| hd :: tl ->
(match of_list tl with
| T tl -> T (hd :: tl))
;;
end
Because packed is unboxed, from a runtime, post type-erasure perspective, this is a classic case TRMC. But because of the packing, I have to match on the recursive call, which means the call is no longer in the tail position. Is this doable? If not should we look into adding support for this specific pattern into the compiler (treat a match on a recursive call where the result of the match is a single branch whose expression is TRMC constructible as a tail call)?
Have you tried ? Add the TRMC annotation and check the resulting assembly.
I worked on extensions to TRMC a while ago, but I never touched the implementation that is in the actual compiler. It might be that this could “just work” if the optimisation is late enough and that the proper transformations are made.
There is most likely no difference, but since your match has one branch you could also try let T tl = of_list tl
Yes I did try and it complains that it’s not a tail call (even if an annotation is added). And let doesn’t work either. I also looked at the generated lambda, and let and match compile down to the same code.
This is a known limitation of the current implementation of the TRMC transformation, which is purely syntactic: OCaml - The “Tail Modulo Constructor” program transformation. In particular, let-bound calls are not eligible to TRMC optimization.
I have a branch here where I made the simple change of transforming
let x = (f[@tailcall]) arg in
x
to
(f[@tailcall]) arg
only if there’s one use site. This is done at the lambda level right before the TRMC simplification, and since unboxed variants disappear in the lambda, this ends up being enough to achieve TRMC for packed GADTs.
However, this is a larger change since it changes the meaning of [@tailcall] to actually change the order of evaluation, but not unprecedented since [@tail_mod_cons] also affects the order of evaluation, and any code that this affects would have produced warnings about bad tailcall attributes before this change.
You can also imagine having an independent attribute [@defer] whose role is to defer the execution to the call site (which I think in this case could be useful in a lot of other scenarios like code organization), and which fails with a warning if there’s not exactly one use site, and then the solution for this would be to use [@defer].
If either of these options sound like a good idea, I’ll open a PR with a cleaner/correct implementation[1]. I can also just open a draft PR with the branch in the current state to move the conversation to the repo if that’s a better venue.
The proof of concept in the branch definitely breaks for certain scenarios like allowing let x = (foo[@tailcall]) bar to be defined at the top level, and will happily substitute that somewhere else. ↩︎
Not quite. It takes advantage of a situation where the evaluation order has been left unspecified. If the user has explicitly specified an evaluation order by using let-in constructs, then it is not affected by the transformation. (But then the transformation might fail, if the user has asked the recursive call to be performed first.)