When building with dune, is there a way to enforce the order of ppx rewrites?
For example, if I have a ppx that contains a type definition, and there’s a deriving clause attached to the type definition, I want to be sure the containing ppx is applied first, and then the deriver.
Dune simply links the various ppx rewriters in the order they are specified in the dune file. It’s up to the underlying driver mechanism to decide whether to take this order into account or not.
There are currently two drivers out there:
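For reference, the order being discussed is the order of the names in the pps field. A minimal sketch, where mylib, ppx_outer and ppx_deriver are hypothetical names and not packages from this thread:

```dune
; Hypothetical library using two hypothetical rewriters.
; Dune links them into one driver in the order listed here;
; whether that order affects the result is up to the driver.
(library
 (name mylib)
 (preprocess
  (pps ppx_outer ppx_deriver)))
```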
the ocaml-migrate-parsetree one
the ppxlib one
The ppxlib one is built on top of the ocaml-migrate-parsetree one, and in particular all transformations registered with ppxlib appear as a single transformation registered with ocaml-migrate-parsetree.
ocaml-migrate-parsetree only accepts whole-file transformations, i.e. ast -> ast functions. These are ordered by the version of the OCaml AST they use in order to minimise the number of AST upgrades/downgrades. It is however possible to attach a priority to a particular transformation to force its position in the pipeline.
ppxlib accepts two kinds of transformations: whole-file ones, just like ocaml-migrate-parsetree, and more high-level, well-defined rules that can be merged together. A typical rule rewrites a particular extension point. All rules are merged together and applied in a single pass, which ensures both good performance and good semantics. Indeed, the output of individual rules is rewritten recursively, which ensures that if the expansion of a particular extension point produces more extension points or even [@@deriving] attributes, these are properly expanded no matter the order in which the ppx rewriters were specified in the dune file.
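To make the difference concrete, here is a toy model in plain OCaml. It is not ppxlib code: the node type, the rule record and both functions are made up for illustration. It only sketches why merged rules with recursive re-expansion give the same result in any order, while chained whole-file passes do not.

```ocaml
(* Toy AST: plain text or an extension point identified by name. *)
type node =
  | Plain of string
  | Ext of string

(* A toy "rule": expands one extension name into nodes, which may
   themselves contain further extension points. *)
type rule = { name : string; expand : unit -> node list }

(* Merged rules, single pass: the output of each expansion is expanded
   again. No guard against rules that expand into themselves. *)
let rec expand_all (rules : rule list) (nodes : node list) : node list =
  List.concat_map
    (fun node ->
      match node with
      | Plain _ -> [ node ]
      | Ext name -> (
          match List.find_opt (fun r -> r.name = name) rules with
          | Some r -> expand_all rules (r.expand ())
          | None -> [ node ]))
    nodes

(* Whole-file pass: expands only its own extension, once, and does not
   revisit what it produced. *)
let whole_file_pass (r : rule) (nodes : node list) : node list =
  List.concat_map
    (function Ext n when n = r.name -> r.expand () | node -> [ node ])
    nodes

let outer = { name = "outer"; expand = (fun () -> [ Plain "a"; Ext "inner" ]) }
let inner = { name = "inner"; expand = (fun () -> [ Plain "b" ]) }
let input = [ Ext "outer"; Plain "c" ]

(* Same result whichever way the rules are listed. *)
let merged_1 = expand_all [ outer; inner ] input
let merged_2 = expand_all [ inner; outer ] input

(* Running inner before outer leaves the Ext "inner" that outer produced. *)
let seq_inner_first = input |> whole_file_pass inner |> whole_file_pass outer
let seq_outer_first = input |> whole_file_pass outer |> whole_file_pass inner
```

In this model both merged results are fully expanded, whereas seq_inner_first still contains an unexpanded extension point. This is the ordering problem described for whole-file transformations.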
In general, trying to reason about the order of whole-file transformations is tedious. Authors of ppx rewriters don’t know what other ppx rewriters the user is going to combine with their own, or how they will all interact with each other. Users of ppx rewriters usually don’t know the low-level implementation details of each ppx rewriter or how they should be ordered. That’s why in ppxlib we made the design choice that the overall rewriting is always completely independent of the order in which the ppx rewriters are specified.
In conclusion, in today’s world you cannot reliably enforce a particular order. I would suggest describing your use case in more detail so that we can see how it can fit in the current world or how we can extend the world to make it work.
ppx1 : traverses the AST, and makes sure that invocations of ppx2 do not occur in particular syntactic positions, and that “deriving bin_io” and another deriving target do not occur anywhere in the original AST; it fails if these properties do not hold
ppx2 : rewrites some types to include “deriving bin_io” and the other deriving target
ppx3 : rewrites based on the other deriving target
So if ppx2 were applied before ppx1, for example, compilation would fail unnecessarily.
I’m not sure what additional detail you’re seeking, beyond what I wrote above.
ppx1 does not change the AST in any way, it’s just there to enforce syntactic restrictions.
ppx2, besides adding “deriving bin_io” and another deriving target, also generates some functor invocation boilerplate.
ppx3 processes the other deriving target, generating some simple value definitions within a module.
Based on your earlier comment, we should be able to run ppx1 as a separate check, not as part of compile-time processing. At compile time, ppx2 and ppx3 should then work without regard to order.
My current workflow applies a first rewriter and then a second one.
The first rewriter searches for type declarations annotated with [@@first] and replaces them with two or three other type declarations. In the general case, the original declaration is removed from the code. The generated type declarations are annotated with [@@deriving...] (usually by copying all attributes other than [@@first] from the original type definition).
[@@deriving second] generates the rest of the code.
Example. Before:
type t = ... [@@first] [@@deriving blah]
After applying first:
type t1 = FullyAbstractType of t [@@deriving blah]
type t = ... (* reconstruct original t using t1*) [@@deriving blah]
type t3 = ... [@@deriving blah]
and then [@@deriving blah] is expanded.
Also, I want to mention that the original type declaration should be removed by the syntax extension, and hence I can’t use Driver.register_transformation ~rules because it keeps the old definitions.
Thanks for the details. We have a very similar ppx rewriter inside Jane Street, except that instead of using a [@@first] attribute we put the whole type declaration inside an extension point:
[%%foo type t = ... [@@deriving ...] ]
(* or: *)
type%foo t = ... [@@deriving ...]
Using an extension point fits well in the ppxlib model, so I’d suggest doing that instead.
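As a rough illustration of that suggestion, here is a minimal sketch of a ppxlib rewriter for [%%foo type ...]. It is not the Jane Street rewriter mentioned above. The name "foo" is taken from the example, but the expansion body is a placeholder that simply re-emits the declarations; a real rewriter would build its own list of type declarations there. It assumes a recent ppxlib (Extension.V3) and would be built as a library with (kind ppx_rewriter).

```ocaml
open Ppxlib

(* Placeholder expansion: receives the type declarations found inside
   [%%foo type ...] and returns a single structure item. Since an
   extension must expand to one item, several generated declarations
   are wrapped in "include struct ... end". *)
let expand ~ctxt rec_flag (tds : type_declaration list) =
  let loc = Expansion_context.Extension.extension_point_loc ctxt in
  let open Ast_builder.Default in
  (* Assumption: a real rewriter would replace [tds] by the declarations
     it generates, keeping their [@@deriving ...] attributes. *)
  let items = [ pstr_type ~loc rec_flag tds ] in
  pstr_include ~loc (include_infos ~loc (pmod_structure ~loc items))

(* Match a payload consisting of exactly one type definition item. *)
let extension =
  Extension.V3.declare "foo" Extension.Context.structure_item
    Ast_pattern.(pstr (pstr_type __ __ ^:: nil))
    expand

let () =
  Driver.register_transformation
    ~rules:[ Context_free.Rule.extension extension ]
    "foo"
```

Because the extension point is replaced by its expansion, the original declaration does not survive unless the expander emits it again. Any [@@deriving ...] attributes on the emitted declarations should then be picked up by the recursive rewriting described earlier.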