I have been working on xobl, a “protocol compiler” for X11 which essentially reads the XML into an AST and applies a series of transformations to infer typing information, so that the backend(s) can use it to produce somewhat idiomatic X11 bindings.
It already kind of works; it’s only missing a few more passes and the cruft needed to establish a connection. But I’ve been at it for almost two years, and by now I feel drained just thinking about working on the damn thing. I’ve rewritten the middle-end (the AST transformations) a few times and I’m never quite satisfied with it.
There are a lot of passes, many of which depend on the previous ones. Each one makes some slight change to the AST, which might or might not mean walking through the whole AST to catch all occurrences of that particular node. Clearly you’ll want to encode semantic errors in the types (so that an AST which has been through a check can no longer represent the error), so each pass ends up having its own unique AST, each depending on the previous one. To change a single node deep in the AST I have to write about a hundred lines of boilerplate in types and mapping functions. Any change in the lower levels of the AST bubbles up to the higher ones, and refactoring becomes a nightmare.
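To make the problem concrete, here is a toy sketch of what one pass looks like with this approach. The types and names (`Parsed`, `Resolved`, `resolve_decl`) are made up for illustration and much smaller than xobl’s real AST; the point is only that changing the `ty` node forces every type above it to be redeclared and re-mapped.

```ocaml
(* Hypothetical, heavily simplified AST; not xobl's real types. *)
module Parsed = struct
  type ty = Ref of string                      (* unresolved name from the XML *)
  type field = { name : string; ty : ty }
  type decl = Struct of string * field list
end

module Resolved = struct
  type ty = Prim of string | Named of string   (* the only node that changed *)
  type field = { name : string; ty : ty }      (* redeclared just to mention the new ty *)
  type decl = Struct of string * field list    (* likewise *)
end

(* The interesting part of the pass: one small function. *)
let resolve_ty (Parsed.Ref n) : Resolved.ty =
  match n with
  | "CARD8" | "CARD16" | "CARD32" -> Resolved.Prim n
  | _ -> Resolved.Named n

(* The boilerplate: one mapping function per enclosing node. *)
let resolve_field (f : Parsed.field) : Resolved.field =
  { Resolved.name = f.Parsed.name; ty = resolve_ty f.Parsed.ty }

let resolve_decl (Parsed.Struct (n, fs)) : Resolved.decl =
  Resolved.Struct (n, List.map resolve_field fs)
```

With two levels above `ty` this is tolerable; with a real AST, every extra level adds another redeclared type and another mapping function per pass.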
I think my method has some strengths, but it’s way too brittle in the face of change for my liking.
I’ve been thinking about making the middle-end mostly “untyped”, and restoring the encoding of semantic errors in the types only in the final pass, just before the AST is handed over to the backend. This would allow me to write the mapping functions once and reuse them for every pass, in a similar way to Ast_mapper from the OCaml compiler, at the cost of some type safety.
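Roughly what I have in mind, as a minimal sketch loosely modelled on the open-recursion record that Ast_mapper uses. Everything here (the single shared AST, `mapper`, `default_mapper`, `resolve_prims`, `run`) is a hypothetical stand-in, not existing xobl code:

```ocaml
(* Hypothetical single AST shared by every middle-end pass. *)
type ty = Ref of string | Prim of string | List_of of ty
type field = { name : string; ty : ty }
type decl = Struct of string * field list

(* Each function receives the whole mapper, so overriding one entry
   affects every recursive call (open recursion). *)
type mapper = {
  map_ty : mapper -> ty -> ty;
  map_field : mapper -> field -> field;
  map_decl : mapper -> decl -> decl;
}

(* Written once: walks the tree and rebuilds it without changing anything. *)
let default_mapper = {
  map_ty = (fun self t ->
    match t with
    | List_of inner -> List_of (self.map_ty self inner)
    | Ref _ | Prim _ -> t);
  map_field = (fun self f -> { f with ty = self.map_ty self f.ty });
  map_decl = (fun self (Struct (n, fs)) ->
    Struct (n, List.map (self.map_field self) fs));
}

(* A pass only overrides the node it cares about. *)
let resolve_prims = {
  default_mapper with
  map_ty = (fun self t ->
    match t with
    | Ref ("CARD8" | "CARD16" | "CARD32" as n) -> Prim n
    | _ -> default_mapper.map_ty self t);
}

let run pass d = pass.map_decl pass d
```

The traversal boilerplate is gone, but nothing in the types says that `Ref "CARD8"` can no longer appear after `resolve_prims` has run; that guarantee would only come back in the final, typed pass.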
The other solution I thought of would be to make most types generic, so that each pass would still have its own AST while reusing the mapping functions from the previous passes. I’m afraid that would quickly become a mess of its own, though.
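A minimal sketch of the generic variant, again with made-up names (`parsed_ty`, `resolved_ty`, `map_decl`) and only one node kind turned into a type parameter:

```ocaml
(* Hypothetical AST parameterised over the type node. *)
type 'ty field = { name : string; ty : 'ty }
type 'ty decl = Struct of string * 'ty field list

(* Written once, reused by every pass that changes the 'ty node. *)
let map_field f fld = { name = fld.name; ty = f fld.ty }
let map_decl f (Struct (n, fs)) = Struct (n, List.map (map_field f) fs)

(* Each pass still gets its own, precise node type. *)
type parsed_ty = Ref of string
type resolved_ty = Prim of string | Named of string

let resolve (Ref n) =
  match n with
  | "CARD8" | "CARD16" | "CARD32" -> Prim n
  | _ -> Named n

(* The pass is just the shared mapper applied to the node-level function. *)
let resolve_decl : parsed_ty decl -> resolved_ty decl = map_decl resolve
```

This keeps the per-pass type safety, but every node kind that varies between passes needs its own type parameter, and `map_decl` needs one function argument per parameter, which is where I expect the mess to start.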
What do you think?