Generate typed AST fragments

What’s the best way nowadays to generate a well-formed AST (optionally checking that it is well-typed) and serialize it to a string? These ASTs might reference external libraries that are not necessarily available to the code that generates them.

I guess it’s a good use case for MetaOCaml, but is there any lightweight way to do this, for instance, with ppx? I’ve seen metapp and ppx_stage, but I’m not sure I need the full power of those. I want to keep the difference between code that runs at configure-time and runtime (because they correspond to different aspects of my software lifecycle) – so I just want to generate, at configure-time, a well-typed AST with some well-typed holes that will not fail at compile-time or runtime.

The use-case is to generate devices for MirageOS, where we need to generate runtime code for parsing command-line arguments or initialising devices. Right now, we have a mix of runtime libraries (where this runtime code lives and which need to be linked with our program), some meta-programming shims for configure-time configuration (to generate runtime code that references and manipulates these libraries and runtime functions) and some poor man’s configure-time code generation (using strings). I am sure there is a nice/clean way to do this 🙂


For simple untyped AST construction and manipulation, I think ppxlib’s metaquot ppx is quite nice. Once a Parsetree value is constructed, it can easily be serialised to a string with Pprintast.
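A minimal sketch of that approach, assuming the file is built with the `ppxlib` library and the `ppxlib.metaquot` preprocessor. `My_device.connect` and `connect_expr` are hypothetical names used only for illustration; the generator never needs `My_device` to exist, because the AST is untyped.

```ocaml
(* Assumed dune stanza: (libraries ppxlib) (preprocess (pps ppxlib.metaquot)) *)
open Ppxlib

(* metaquot expects a [loc] value in scope for the nodes it builds. *)
let loc = Location.none

(* Build the AST of a call to a hypothetical runtime function, with a
   hole ([%e ...]) filled in by a value known at configure-time. *)
let connect_expr ~port : expression =
  [%expr My_device.connect ~port:[%e Ast_builder.Default.eint ~loc port] ()]

(* Serialise the Parsetree fragment to OCaml source text. *)
let connect_source ~port : string =
  Pprintast.string_of_expression (connect_expr ~port)

let () = print_endline (connect_source ~port:8080)
```

Nothing here checks that `My_device.connect` exists or accepts a `~port` argument; a mistake would only show up when the generated code is compiled.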

For example, that’s the approach used by ocaml-swagger to generate HTTP+JSON bindings for APIs.

For typed code generation and stage differentiation, it might be harder to find a simple off-the-shelf solution. From my understanding, the difficulty is that MetaOCaml and the variants it inspired, like ppx_stage, are great for generating typed expressions, but not entire programs.


Building on @rizo’s answer, once you have a Parsetree you can build a Typedtree with compiler-libs as long as you are willing to predefine the referenced types that the code needs. That is how I ported some of the BER MetaOCaml code to PPX.

Quick walkthrough:

  1. The input type `type 'a code = Parsetree.expression` and the output type `type closed_code_repr = Typedtree.expression` in trx.ml.
  2. The transformation signature `val close_code : 'a code -> 'a closed_code` in codelib.mli.
  3. The implementation of `close_code`, including the definitions of external modules that your Parsetree AST is type-checked against, in codelib.ml.

All of that assumes a closed environment that you can type-check against. From what I understand of MirageOS configuration that should be plausible.
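To give a rough idea of the Parsetree-to-Typedtree step, here is a small sketch that type-checks an expression against the compiler's initial environment only. It is not the code from the walkthrough above: `well_typed` is a hypothetical helper, it needs `compiler-libs.common`, and compiler-libs signatures are not stable across OCaml versions (`Compmisc.init_path ()` as written assumes a reasonably recent compiler). Any external modules the generated code references would have to be added to the environment first.

```ocaml
(* Assumed build: link against compiler-libs.common. *)

(* Returns true if [src] parses and type-checks as an expression in the
   initial environment (standard library only). *)
let well_typed (src : string) : bool =
  (* Point the load path at the standard library, then open Stdlib. *)
  Compmisc.init_path ();
  let env = Compmisc.initial_env () in
  match
    Typecore.type_expression env (Parse.expression (Lexing.from_string src))
  with
  | (_ : Typedtree.expression) -> true
  (* Crude for a sketch: any parse or type error counts as "not well-typed". *)
  | exception _ -> false

let () =
  List.iter
    (fun src -> Printf.printf "%s: %b\n" src (well_typed src))
    [ "1 + 2"; "1 + true" ]
```

A real implementation would report the error location rather than discard it, and would keep the resulting `Typedtree.expression` instead of a boolean.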
