Hi, I wrote a ppx that essentially eta-expands certain function definitions and function calls. The goal is to write CPS code that looks as if certain parameters are abstracted over but compiles more efficiently (btw, if someone knows a better way of doing that than a ppx, I’m all ears). This is a very simple and naïve thing for use in one specific project, with no attempt at mangling names or anything. Anyway, I really struggled with how to use Ast_pattern non-trivially. For expressions, I looked at how ppx_let does it: it basically only uses single_expr_payload and then matches on the actual constructor directly (so I did that too). For structure items, there is no equivalent function (or is there?) and I didn’t really figure out how to handle several cases for the same context.
Let’s say I want to handle let-bindings and top-level expressions. I figured out (with some difficulty) that you can extract them as follows (is this even correct btw?):
let extract_strvalue =
  Ast_pattern.(pack2 (pstr ((pstr_value __ __) ^:: nil)))
let extract_streval =
  Ast_pattern.(pack2 (pstr ((pstr_eval __ __) ^:: nil)))
I know how to write the corresponding expansion function, and then do one of:
let strvalue_ext =
  Extension.(declare "myext" Context.structure_item extract_strvalue expand_strvalue)
(* or *)
let streval_ext =
  Extension.(declare "myext" Context.structure_item extract_streval expand_streval)
but crucially not both at the same time (and I want them to have the same name because they morally do the same thing). I can’t figure out how to do it as a single pattern given that the types are different, other than doing the entire transformation inside the pattern and having the expansion function be trivial (but surely that is not the intended use of the API?), or wrapping the extracted data in a polymorphic variant (but then you have to match in the expansion function, and surely that defeats the purpose of Ast_pattern?). So I guess my question is: how are you supposed to handle different cases of structure items where you want to keep all the data and do something different in each case? Or alternatively, is there a way to access a structure item directly and match on the descriptor yourself, the same way you can for expressions?
(Note: this is mostly a theoretical question, as in practice I don’t need to support top-level expressions; I had only tried to because it felt more complete. It’s just that at this point I’m really puzzled over the intended way of using Ast_pattern.)
Well, my extension adds parameters to function definitions. At the expression level it is mostly intended to be used on let-bindings, and it turns:
let%myext somefunction a b = ...
into
let somefunction a b extra1 extra2 = ...
but it also works on anonymous functions (tacking on the parameters at the end), and finally on other arbitrary expressions where it turns [%myext <expr>] into fun param1 param2 -> <expr>. With respect to structure items, I thought it made sense to apply it to let-bindings in the same way as for expression-level let-bindings, and to top-level expressions where it would do the same stuff it does to expressions (I initially tried to do this for the sake of completeness; after further thought there is no practical use for it so this is why I say my question is theoretical). I was able to write expansion code that does all of this but I hit an error when I tried to register the structure-level binding extension and the structure-level expression extension under the same name, and what I’m asking is what kind of code you are expected to write in such cases. All the ways I could think of make either the pattern or the expander trivial.
I see, indeed you’re only supposed to register an extension once per context. In your case, the rule to rewrite Pstr_eval and Pstr_value is the same (same name, same context: structure_item) and needs to be registered once and deal with both cases on its own.
Depending on what you want to do with those, you might be able to write an Ast_pattern that works for both, but I would not necessarily advise it.
Ast_pattern has two main advantages:
It will report errors for you if the extension payload is ill-formed, describing the expected payload.
It is more future-proof than pattern-matching over the types directly, because its API is stable whereas the AST types aren’t (though we’re about to change that slightly to bring a bit of stability to the ppx ecosystem).
Ultimately, how you split the work between Ast_pattern and manual handling of the AST is up to you, but my personal advice would be the following: use Ast_pattern to narrow down the form of the payload (in your example, that would be a single structure item) and let the expansion function do the rest, that is, match over the structure_item_desc to deal with the Pstr_value and Pstr_eval cases separately.
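A minimal sketch of that split, assuming a recent ppxlib: the pattern only checks that the payload is exactly one structure item, and the expander dispatches on the descriptor. The names rewrite_expr and rewrite_binding are hypothetical placeholders for the real transformation (here they leave the code unchanged), and the error message is made up for illustration.

```ocaml
open Ppxlib

(* Hypothetical placeholders for the real transformation; in this sketch
   they return their input unchanged. *)
let rewrite_expr (e : expression) : expression = e

let rewrite_binding (vb : value_binding) : value_binding =
  { vb with pvb_expr = rewrite_expr vb.pvb_expr }

(* The expander receives the whole structure item and matches on its
   descriptor, handling the two supported cases separately. *)
let expand ~loc ~path:_ (item : structure_item) : structure_item =
  match item.pstr_desc with
  | Pstr_value (rec_flag, bindings) ->
      { item with
        pstr_desc = Pstr_value (rec_flag, List.map rewrite_binding bindings) }
  | Pstr_eval (expr, attrs) ->
      { item with pstr_desc = Pstr_eval (rewrite_expr expr, attrs) }
  | _ ->
      Location.raise_errorf ~loc
        "myext: expected a let-binding or a top-level expression"

(* The pattern only narrows the payload down to a single structure item. *)
let myext =
  Extension.declare "myext" Extension.Context.structure_item
    Ast_pattern.(pstr (__ ^:: nil))
    expand

(* One registration covers both cases, since name and context are shared. *)
let () = Driver.register_transformation ~extensions:[ myext ] "myext"
```

With this shape, Ast_pattern still reports a malformed payload (for instance, two structure items), while unsupported kinds of structure item are rejected by the expander itself.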
This makes sense. I guess I happened to want to write something for which Ast_pattern is not very useful (a broadly applicable transformation where you don’t match very deeply and everything the programmer actually wrote ends up in the final code). I ended up adding some other functionality to my ppx that is a bit more “magic”, where parts of what the user writes are substantially reinterpreted. In this case the benefits are clearer: actually writing out the match statement would have been very ugly (what I wanted was a let-binding where the pattern is a tuple of constant strings).
One problem I still had for my second application is that when you’re reconstructing an AST close to the original, you lose the locations for intermediate nodes, and to keep them you need to pepper your pattern with calls to map0' and its friends which are a bit difficult to figure out. I might propose a PR with extra combinators (e.g. a shortcut for map0' ~f:Fun.id called with_loc or something).
let%def somefun a b = do_something (in something_else)
(* becomes *)
let somefun a b extra1 extra2 = do_something (in something_else)
fun%def a b -> do_something
(* becomes *)
fun a b extra1 extra2 -> do_something
call' somefun a b
(* becomes -- this is the "special function" feature of ppxlib *)
somefun a b extra1 extra2
let%set () = anotherfun x y in do_something
(* becomes *)
let extra1, extra2 = anotherfun x y in do_something
The point is that I am writing something in continuation-passing style where the type of continuations ends in:
type 'a cont = foo -> bar -> 'a
Most of my functions do not and should not do anything with the final parameters of continuations and just thread them through (this is for parser combinators and the final parameters contain the parser position etc.), so I would like to write things as if 'a cont was abstract. This makes the code nicer and allows me to change the type without rewriting everything. However, it seems that this has a big performance impact because it prevents the compiler from inlining stuff (I read this on here but confirmed it in experiments). So I use my ppx to turn eta-reduced code into eta-expanded code. Perhaps this is over-engineered but it was fun to design. I did not attempt to check parameters have the right scopes, I just always use the same (mangled) names and trust the programmer to use call' and %def in all the right places.
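To make the two styles concrete, here is a small hand-written sketch, not output from the ppx. It assumes int stands in for foo (say, a parser position) and string list for bar; choose_reduced and choose_expanded are made-up names.

```ocaml
(* Assumption: [int] stands in for [foo] and [string list] for [bar]. *)
type 'a cont = int -> string list -> 'a

(* Eta-reduced style: written as if ['a cont] were abstract, so the
   trailing parameters never appear in the body. *)
let choose_reduced (k1 : 'a cont) (k2 : 'a cont) (flag : bool) : 'a cont =
  if flag then k1 else k2

(* Eta-expanded style: the trailing parameters are named and threaded
   through every call, which is the shape the ppx is meant to generate. *)
let choose_expanded (k1 : 'a cont) (k2 : 'a cont) (flag : bool) pos log : 'a =
  if flag then k1 pos log else k2 pos log

(* Both versions compute the same result when fully applied. *)
let () =
  let k1 pos _log = pos + 1 and k2 pos _log = pos - 1 in
  assert (choose_reduced k1 k2 true 10 [] = choose_expanded k1 k2 true 10 [])
```

The two definitions are observationally equivalent; the difference is only in how many arguments each function syntactically takes, which is what affects inlining.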
The %set expansion lets me call helpers that do modify the parameters while hiding them from my code. At some point I also implemented this:
let%set "extra1" = something in some_other_thing
(* becomes *)
let mangled_extra1 (* same name used by call' *) = something in some_other_thing
but decided it was poor design: those few functions that do something with the parameters should be written out in full. This is what I am talking about above (a let-binding whose pattern is a constant string or a tuple thereof).
So here is an example of use (where the phantom parameter is just a list of log messages):
type 'a cont = string list -> 'a
let push_msg msg messages = msg :: messages
let%def my_if_test (on_true : 'a cont) (on_false : 'a cont) scrutinee =
  if scrutinee then
    call' on_true
  else
    let%set () = call' push_msg "test returned false" in
    call' on_false
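For reference, this is roughly what I would expect that example to expand to, written out by hand and following the %def, call' and %set rules shown earlier. The name messages is only a stand-in for the mangled name the ppx actually uses, and finish is a made-up continuation for the usage check.

```ocaml
type 'a cont = string list -> 'a

let push_msg msg messages = msg :: messages

(* Hand-written expansion: %def appends the extra parameter, call' passes
   it along, and %set rebinds it to the helper's result. [messages] is a
   stand-in for the real mangled name. *)
let my_if_test (on_true : 'a cont) (on_false : 'a cont) scrutinee messages =
  if scrutinee then on_true messages
  else
    let messages = push_msg "test returned false" messages in
    on_false messages

(* Usage: [finish] is a hypothetical final continuation that returns a tag
   together with the accumulated log. *)
let () =
  let finish tag messages = (tag, messages) in
  assert (my_if_test (finish "yes") (finish "no") true [] = ("yes", []))
```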