Generate functions at compile time

I am new to the OCaml metaprogramming ecosystem. I would like to generate at compile time a structure with some functional values. What would be the simplest way to achieve this? Which PPXs should I look at, or should I go with compiler-libs? Do you have a reference to some sample code to get me started?

You probably want MetaOCaml. Or its spiritual successor: MacoCaml (see below).

No, that’s for generating compiled code at runtime. For generating at compile time, you need MacoCaml. For some reason I cannot find the implementation link…

MacoCaml: Staging Composable and Compilable Macros | Proceedings of the ACM on Programming Languages

Hi @borisd,

Between writing a PPX or using compiler-libs, definitely a PPX! To write a PPX, you’ll want to use ppxlib. Given that you’re planning to generate new code (rather than replacing code), you’ll want to write a deriver. You can have a look at the deriver example on the ppxlib repo and the documentation on ppxlib, particularly the section on how to register a deriver. You can also have a look at some of the Jane Street derivers, which are all very hygienic, e.g. ppx_compare.

To come to your concrete example:

I would like to generate at compile time a structure with some functional values.

As a basic starting point, you could derive a module called My_generated_module containing an identity function (not checked):

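(* Expansion_context and Deriving both live in Ppxlib *)
open Ppxlib

(* Ignores the type declarations and always emits the same module *)
let derive_id ~ctxt _type_declarations =
  (* metaquot needs a [loc] in scope for the quotations below *)
  let loc = Expansion_context.Deriver.derived_item_loc ctxt in
  let id_fun = [%stri let my_id x = x] in
  [%str module My_generated_module = struct [%%i id_fun] end]

(* Register the deriver under the name "id" *)
let () =
  let generator = Deriving.Generator.V2.make_noarg derive_id in
  Deriving.add "id" ~str_type_decl:generator |> Deriving.ignore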

That example is using metaquot, so apart from adding a (kind ppx_deriver) field to your PPX’s dune file, you’ll also need to add a (preprocess (pps ppxlib.metaquot)) field.
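
To make the effect concrete, here is roughly what the compiler would see once such a deriver has run on an annotated type. This is a hand-written sketch of the expansion, not actual ppxlib output, and the type name t is only a placeholder for illustration.

```ocaml
(* What you would write, assuming the deriver is registered as "id":

     type t = int [@@deriving id]

   A deriver keeps the type declaration and appends the generated
   items after it, so the expanded code is roughly the following. *)
type t = int

module My_generated_module = struct
  let my_id x = x
end

(* The generated function is then usable like any hand-written one. *)
let () = assert (My_generated_module.my_id 42 = 42)
```

Since this deriver ignores the type declarations it receives, the generated module is the same whatever type it is attached to; a real deriver would inspect them to decide what to generate.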

Writing a PPX (concretely, a deriver) would be the well-supported and established way of doing what you want. About the other options that have been mentioned:

We definitely don’t recommend using compiler-libs. It’s explicitly unstable, has a very restricted API, and is not compatible across compiler versions.

MetaOCaml is very cool, but serves a different purpose. As far as I understand, it’s more meant for runtime optimizations etc.

MacoCaml is definitely very cool as well. It’s still quite new and in an experimental phase, but it looks very interesting to me (btw, I’m a ppxlib maintainer). In case you’re interested in going a more experimental route, let me know and I can ping the right people to let you know which compiler fork etc you’d have to pin.

Btw, what did you read that made you think compiler-libs might be a good solution?

Hi @pitag,
Thank you very much for your detailed comment. I read about using compiler-libs for this use case in a comment in a related thread.

Note that in my Vulkan use case, the source is the “xml” “specification” of the Vulkan headers, and I am still using metaquot as much as possible. (And in this specific use case, even if it might be because I am a compiler developer, compiler-libs has been much more stable than the Vulkan specification.)

If the code is derived from OCaml source, ppxlib is a better solution in terms of maintainability, because it covers the parsing side.

To generate simple OCaml (types, maybe some basic functions) I’d say that the simplest is a dune rule and a program that uses printf (yes, printf) to emit code. For example if you’re reading some basic data in an XML file:

(executable
 (name gen))

(rule
 (targets foo.ml)
 (deps (:file foo.xml))
 (action (with-stdout-to %{targets} (run ./gen.exe %{file}))))

and then in gen.ml:


let emit_type (ty : some_type_description) =
  Printf.printf "type %s = {\n" ty.name;
  List.iter (fun field -> …) ty.fields;
  Printf.printf "}\n"

let () =
  (* parse the input file, json, xml, whatever, if needed *)
  let things = … in
  Printf.printf "(* auto-generated, do not modify *)\n\n";
  List.iter emit_type things

This can be quite straightforward and doesn’t add any fragile dependency to your program. To emit really complicated code you can use ppxlib or a custom DSL for this use case.
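
For reference, a self-contained version of such a generator could look like the sketch below. The record types field and type_description, the helper render_type, and the hard-coded sample are all hypothetical stand-ins: a real generator would build the description by parsing its input file instead.

```ocaml
(* Hypothetical description of a record type; in a real generator this
   would be filled in by parsing the input file. *)
type field = { field_name : string; field_type : string }
type type_description = { name : string; fields : field list }

(* Render one record type declaration as OCaml source text. *)
let render_type (ty : type_description) : string =
  let buf = Buffer.create 64 in
  Printf.bprintf buf "type %s = {\n" ty.name;
  List.iter
    (fun f -> Printf.bprintf buf "  %s : %s;\n" f.field_name f.field_type)
    ty.fields;
  Printf.bprintf buf "}\n";
  Buffer.contents buf

(* Hard-coded sample input standing in for parsed data. *)
let sample =
  { name = "point";
    fields =
      [ { field_name = "x"; field_type = "float" };
        { field_name = "y"; field_type = "float" } ] }

(* Write the generated source to stdout, for the dune rule to capture. *)
let () =
  print_string "(* auto-generated, do not modify *)\n\n";
  print_string (render_type sample)
```

Building the text in a Buffer and printing it at the end, rather than printing piece by piece, is only a design choice that makes the rendering function easy to test; printing directly with Printf.printf works just as well.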

What you’re saying, c-cube, is quite similar to using cinaps, right? It will conflict with ocamlformat and similar tools, and the generated code is standalone rather than derived from, say, a type, but it works well when you want something very simple.

I think cinaps is a way to do that inline in a file, but I’ve never actually used it. It seems to be very convenient for repetitive code patterns, the kind of things for which I’d use a basic macro in rust.

What I’m suggesting (indeed for simple cases, or codegen tools that mostly produce, say, types and printers) is really staged in the sense that a .ml file is used to write code into another .ml file via a dune rule. It’s a good compromise for things like generating sum types from data, or similarly straightforward types from data. I think the generated code could be formatted by ocamlformat but I generally don’t commit it at all, it lives in _build only.
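
As an illustration of the sum-type case, a generator along these lines might emit a variant type together with a printer. This is only a sketch: the function render_variant and the sample names are hypothetical, and it assumes the constructor list is non-empty and already contains valid capitalized OCaml identifiers.

```ocaml
(* Turn a list of constructor names (assumed non-empty and valid
   capitalized OCaml identifiers) into the source text of a variant
   type and a matching to_string function. *)
let render_variant (type_name : string) (constructors : string list) : string =
  let cases = List.map (fun c -> Printf.sprintf "  | %s\n" c) constructors in
  let arms =
    (* %S prints the name as a quoted, escaped OCaml string literal *)
    List.map (fun c -> Printf.sprintf "  | %s -> %S\n" c c) constructors
  in
  Printf.sprintf "type %s =\n%s\nlet %s_to_string = function\n%s"
    type_name (String.concat "" cases)
    type_name (String.concat "" arms)

(* In a real setup the names would come from the data file. *)
let () = print_string (render_variant "color" [ "Red"; "Green"; "Blue" ])
```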

In my experience, if your “source” is an external definition or file that is not in OCaml, then just generating the OCaml code as files and compiling it is simpler (and compiles faster) than writing a PPX. On the other hand, if you are using OCaml source code to derive your code definitions, then a ppxlib deriver (as suggested above) is the right direction.