Can MetaOCaml-like quasiquote syntax be used for writing ppx rewriters in some way?
To be clear, I don’t mean that MetaOCaml should be merged. The staging semantics of MetaOCaml require non-trivial changes to the compiler and runtime. But I do wonder whether the quasiquote syntax of MetaOCaml can be ported to make ppx rewriters easier to write and read.
The current ppx system is based on AST rewriting, which is much more precise and structural than text-level manipulation. However, AST transformation functions are clearly harder to write and read than ordinary OCaml programs, and they require knowledge of the internal AST representation.
So I came up with the idea: why not use quasiquotes? So far I have come up with the following pros and cons:
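To make that cost concrete, here is a minimal sketch. It uses a toy AST that I made up for illustration (a hypothetical stand-in for the compiler's Parsetree, which is far larger and also carries locations and attributes). Even for a one-line fragment, the constructor form is noticeably further from the source than the concrete syntax shown in the comment.

```ocaml
(* Toy AST: a simplified, hypothetical stand-in for Parsetree. *)
type expr =
  | Var of string
  | Int of int
  | Apply of expr * expr list
  | Let of string * expr * expr

(* Hand-written construction of the fragment
     let y = f x 1 in y
   where x is an expression supplied by the rewriter. *)
let by_hand (x : expr) : expr =
  Let ("y", Apply (Var "f", [x; Int 1]), Var "y")

(* Small printer, so the built tree can be compared with concrete syntax. *)
let rec show = function
  | Var v -> v
  | Int n -> string_of_int n
  | Apply (f, args) ->
    "(" ^ String.concat " " (List.map show (f :: args)) ^ ")"
  | Let (v, e, body) ->
    "let " ^ v ^ " = " ^ show e ^ " in " ^ show body

let () = print_endline (show (by_hand (Var "x")))
```

With a real Parsetree the same fragment needs more constructors still, since every node also has a location and a list of attributes.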
- Perhaps most importantly, quasiquote syntax plays well with the current model. It needn’t replace the ppx AST rewriting mechanism. In fact, it can be built on top of it. As quasiquote expressions have syntactic meaning just like ordinary OCaml programs, they can be easily parsed and translated to AST transformation code. Ultimately, no changes at all are needed to the current compiler, or even to existing ppx facilities. We can have a program that translates quasiquoted programs to ppx rewriters, similar to ocaml{yacc,lex} and Menhir.
- Quasiquotes are more stable than the AST. I know that there have been efforts to make ppx rewriters work on a unified and more stable AST, but again, quasiquotes can coexist with all these efforts. Additionally, quasiquotes are exactly as stable as the OCaml syntax itself, which sees far fewer breaking changes. With the ocamlyacc-like approach mentioned above, the job of adapting to different versions of the AST would be confined to the translator itself; authors of rewriters would not have to care about it.
- Quasiquotes are more readable and intuitive. AFAIK this is an obvious fact. The MetaOCaml quasiquote syntax introduces only three extra constructs. For ppx, only bracket and escape would be necessary. Optimistically, these concepts can be explained to someone completely new to meta-programming within 10 minutes, as long as he/she knows the OCaml syntax. An extra bonus is that the existence of MetaOCaml serves as evidence that the quasiquote syntax is compatible with existing OCaml syntax.
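The "translate quasiquotes into AST-building code" step from the first point could look roughly like the sketch below. Everything here is an assumption made for illustration: the template type, the escape node and the constructor names Var / Int / Apply belong to a toy AST, not to any existing tool. The idea is only that a bracket, once parsed, can be lifted mechanically into source code that rebuilds it.

```ocaml
(* Toy template: an expression AST plus an escape node (hypothetical). *)
type template =
  | TVar of string
  | TInt of int
  | TApply of template * template list
  | TEscape of string   (* .~name : splice the rewriter variable name *)

(* Emit OCaml source text that rebuilds the quoted fragment with the
   toy constructors Var / Int / Apply. An escape is emitted as the bare
   variable name, so the generated code refers to the rewriter value. *)
let rec lift = function
  | TVar v -> Printf.sprintf "Var %S" v
  | TInt n -> Printf.sprintf "Int (%d)" n
  | TApply (f, args) ->
    Printf.sprintf "Apply (%s, [%s])"
      (lift f) (String.concat "; " (List.map lift args))
  | TEscape name -> name

(* The bracket  .< f .~x 1 >.  after parsing: *)
let quoted = TApply (TVar "f", [TEscape "x"; TInt 1])

let () = print_endline (lift quoted)
```

A translator for a real AST would have to emit locations and attributes too, and would be the only place that needs updating when the AST changes between compiler versions.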
However, there are some cons, or at least related design issues, as well:
- Expressiveness. Template Haskell uses quasiquotes, but it exposes the internal AST as well, because quasiquotes alone are not expressive enough. The issue concerns non-inductive lists, for example distinguishing (x, y, z) from (x, (y, z)). In OCaml, at least tuples (in both expressions and patterns) have this problem. Perhaps internal AST constructors are necessary here. I am not sure whether directly providing extra syntactic constructs for the problematic cases is feasible. For example:
    let values = [.<x>.; .<y>.; .<z>.] in
    .<.~@(values)>.
  where .~@( ... ) is a special syntax for n-tuples. Another approach would be to adopt the “splice a list” operator (",@") of Lisp.
- Pattern matching against expressions. MetaOCaml, for the sake of type safety, freshness, etc., does not allow deconstruction of expressions. But for ppx rewriters this is necessary. Since ppx rewriters don’t care about the issues above, the only problem would be finding a clean syntax for it. Pattern matching is undoubtedly the best choice, but it has some problems as well:
2.1. Distinguishing variable abstraction and constants. For example, if one wants to match the attribute someAttr on the expression [@someAttr someArg], he/she would not get what is intended using the ordinary OCaml syntax for pattern matching, as someAttr in the pattern would be a variable abstraction and would match any attribute. The solution would be to add a pattern syntax for “constant variables”.
2.2. Expressiveness concerning patterns. This mirrors the expressiveness problem for expressions above. For example, how does one distinguish two patterns that:
a. match a function with exactly one parameter, binding x to the parameter
b. match a function with arbitrarily many parameters, binding x to a list of all parameters
However, the same solutions proposed for expressions should apply here as well.
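To pin down what the two design issues above ask for, here is a minimal sketch over a toy AST (again hypothetical, not Parsetree; the item type and the .~@ notation in the comments are assumptions). On the raw AST the distinctions are easy to state, because tuple components and parameters are plain lists and attribute names are plain strings. A quasiquote syntax would need some construct for each of them.

```ocaml
(* Toy AST, a hypothetical stand-in for Parsetree. *)
type expr =
  | Var of string
  | Tuple of expr list
  | Fun of string list * expr          (* fun p1 ... pn -> body *)

type attribute = { name : string; payload : expr }

(* Tuple components inside a bracket: a single escape (.~e) or the
   hypothetical list splice (.~@es). *)
type item = One of expr | Splice of expr list

let tuple (items : item list) : expr =
  Tuple (List.concat (List.map (function One e -> [e] | Splice es -> es) items))

let values = [Var "x"; Var "y"; Var "z"]

(* .< .~@values >.  would give the flat triple (x, y, z) *)
let flat = tuple [Splice values]

(* .< (.~x, (.~@rest)) >.  would give the nested pair (x, (y, z)) *)
let nested =
  match values with
  | x :: rest -> tuple [One x; One (tuple [Splice rest])]
  | [] -> Tuple []

(* 2.1, constant match: only attributes literally named someAttr. *)
let some_attr_payload = function
  | { name = "someAttr"; payload } -> Some payload
  | _ -> None

(* 2.1, variable match: the identifier binds, so every attribute matches. *)
let any_attr_name { name = some_attr; _ } = some_attr

(* 2.2a: exactly one parameter.  2.2b: all parameters as a list. *)
let one_param = function Fun ([x], _) -> Some x | _ -> None
let all_params = function Fun (xs, _) -> Some xs | _ -> None
```

The splice item plays the role of Lisp's ",@": it is the one extra construct that separates "a list of components" from "one component that happens to be a tuple".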
The most promising aspect of quasiquotes is their compatibility with the existing world, so I wonder whether similar ideas have been discussed before and, if they were ultimately considered unworkable, what the problems were. At least for me, the “external DSL for meta-programming” approach, like ocamlyacc/Menhir for parsing, seems like a promising addition to the existing ppx ecosystem.