I’m working on a router in which I describe routes using S-expressions (sexplib); see the screenshot below for an example. What is the best way to translate S-expressions into OCaml code at build time (I use Dune/Jbuilder)?
Beyond the basic printf-based generator, a simple solution might be to use the compiler-libs library to write a small custom compiler (or you could just generate an OCaml AST) and then add the corresponding rules and dependencies to dune.
Why not simply write a code generator that parses the sexp and saves the result to a file as OCaml source? Then in dune you can define a rule whose target is the output file and whose action calls this executable (which itself can be defined as an executable target). See for example: https://github.com/xapi-project/xen-api/blob/master/ocaml/xapi/jbuild
Indeed, generating an AST (with compiler-libs + ppx_metaquot) has a higher entry cost than simply concatenating strings, but it can be more reliable and maintainable in the long term.
This is exactly what I want to do (see the IMPORTANT section at the bottom of my question). My question was about the code generation itself. I had thought about simple string concatenation and was asking about alternatives to that approach.
@octachron recommended taking a look at compiler-libs, which looks very intriguing to me but, for an OCaml newbie like me, also looks like substantially more work. I think I will use dead-simple string concatenation for the first version and keep compiler-libs in mind for a future refactoring.
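For reference, the string-concatenation approach could look roughly like the sketch below. The `route` record, the `Handlers` module named in the generated code, and the `dispatch` function are all hypothetical; the real route format would come from parsing the sexp file, which is left out here.

```ocaml
(* Hypothetical route description. In a real generator these values
   would be obtained by parsing the S-expression file. *)
type route = { path : string; handler : string }

(* Build the text of an OCaml function with one match case per route.
   The generated code refers to a hypothetical Handlers module. *)
let generate (routes : route list) : string =
  let buf = Buffer.create 256 in
  Buffer.add_string buf "let dispatch path =\n  match path with\n";
  List.iter
    (fun r ->
      (* %S prints the path as an OCaml string literal, with escaping. *)
      Buffer.add_string buf
        (Printf.sprintf "  | %S -> Some Handlers.%s\n" r.path r.handler))
    routes;
  Buffer.add_string buf "  | _ -> None\n";
  Buffer.contents buf
```

The resulting string can be written to the file that the dune rule declares as its target. Using `%S` rather than `%s` for the path avoids emitting broken code when a path contains quotes or backslashes.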
If generating OCaml code without location directives pointing to the source is acceptable to you, you can use the approach that we use in atdgen. Instead of writing just printfs and trying to keep track of the indentation, we produce a tree of lines of code.
You can define your own, it’s straightforward:
type t =
  | Line of string (** single line (not indented) **)
  | Block of t list (** indented sequence **)
  | Inline of t list (** in-line sequence (not indented) **)
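To turn such a tree into text, a small recursive printer is enough. The version below is only a sketch: the name `to_string` and the two-space indentation step are assumptions, not necessarily what atdgen does.

```ocaml
type t =
  | Line of string
  | Block of t list
  | Inline of t list

(* Render a list of nodes into a string. Each Block adds two spaces of
   indentation to everything it contains; Inline adds none. *)
let to_string (nodes : t list) : string =
  let buf = Buffer.create 256 in
  let rec print indent node =
    match node with
    | Line s ->
        Buffer.add_string buf (String.make indent ' ');
        Buffer.add_string buf s;
        Buffer.add_char buf '\n'
    | Block l -> List.iter (print (indent + 2)) l
    | Inline l -> List.iter (print indent) l
  in
  List.iter (print 0) nodes;
  Buffer.contents buf
```

For instance, `to_string [Line "let f x ="; Block [Line "x + 1"]]` yields the two lines `let f x =` and `  x + 1`. The generator functions never have to think about the current indentation level; only the printer does.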
The functions that generate OCaml code return a t list rather than writing to a buffer. Here’s a random sample:
let l =
  insert (Line "Bi_outbuf.add_char ob ',';") (Array.to_list a)
in
let op, cl =
  if p.std then '[', ']'
  else '(', ')'
in
[
  Block [
    Line (sprintf "Bi_outbuf.add_char ob %C;" op);
    Inline l;
    Line (sprintf "Bi_outbuf.add_char ob %C;" cl);
  ]
]
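The sample above relies on an `insert` helper that is not shown. Judging from how it is called, it presumably places a separator between consecutive elements of a list; the definition below is a guess along those lines, and the `write_x` and `write_y` lines are made-up stand-ins for the per-field writer code.

```ocaml
type t =
  | Line of string
  | Block of t list
  | Inline of t list

(* Assumed behaviour: put [sep] between consecutive elements of [l].
   Lists with zero or one element are returned unchanged. *)
let rec insert (sep : 'a) (l : 'a list) : 'a list =
  match l with
  | [] | [ _ ] -> l
  | x :: rest -> x :: sep :: insert sep rest

(* Hypothetical per-field writer lines, separated by code that emits a
   comma into the output buffer at run time. *)
let fields = [ Line "write_x ob x;"; Line "write_y ob y;" ]
let with_commas = insert (Line "Bi_outbuf.add_char ob ',';") fields
```

Here `with_commas` has three elements: the two writer lines with the comma-emitting line between them.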
Yes, except that the solution I’m proposing here produces subpar code (long lines, wasted whitespace). The goal here is to increase the readability of the generator’s source code, at the expense of reducing the readability of the generated code. It’s a worthy trade-off when the generated code is boilerplate that rarely has tricky bugs. Also, Format can be slow on some pathological input. I don’t know if it’s typical on generated code, but it’s nice to know it won’t happen.
For higher-quality output, we have easy-format, which is a functional wrapper around the Format module. Even though I’m the original author, I find it too cumbersome and not worth the effort if the user isn’t going to read the generated code.
I wouldn’t worry about indentation so much. Why not use a tool such as ocp-indent, either when you want to read the file or as the last stage of your generation process (so that location directives are correct)?