Ppxlib maintenance summary

I recently started working on ppxlib again thanks to the OCaml Software Foundation and wanted to report back to the community all the work their funding made possible so far along with the plan for the next steps.

Know that OCSF is only funding me part time on this and that I’m open to more OCaml freelance work!

Summary of the work

Improved error reporting

Ppxlib has an -embed-error flag that is most useful to merlin as it allows it to always get an AST out of the driver and allows it to keep operating normally when a ppx raises a located exception (as in raised with Location.raise_errorf) as it would always get an AST out of the driver run.

The problem with this mode was that it didn’t try to recover from such exceptions and would stop applying transformations. The error was properly reported by merlin and it still had a valid AST to work with but none of the potential errors from subsequent rewriting or code gen would be reported.
This lead to a tedious workflow where the user would fix one error, save the file, see a new error reported by merlin, fix it, save, and so on.

There was a series of PRs by @Burnleydev1 long pending review that were fixing this by collecting all such exceptions while keeping on the rewriting phases using the last valid AST or node.

I reviewed and fixed those PRs (#447, #453 and #457) and worked on a couple fixes and improvements to error reporting related to this work.

Driver mode for dune

Dune has ongoing internal work to be able to use ppx in development. Since it cannot depend on ppxlib or any ppx at build time, their solution relies on using ppxlib and ppx normally in development but using already preprocessed copies of source files that require rewriting for bootstrapping, making their opam build ppx-free.

They require a special driver mode that writes to the output file only if any actual rewriting happened.

I worked on a first prototype of this using a pre-existing hook called for each generated AST node.

5.2 compat on trunk-support

The main task since I started working on ppxlib was the 5.2 support. As OCaml 5.2 is coming out soon and ppxlib is a central piece in the OCaml ecosystem, it’s important that ppxlib has a compatible version available for testing out the new compiler. Without it, a lot of opam packages can’t be built with the alpha and beta releases of the compiler because of their ppx dependencies.

ppxlib has a trunk-support branch that is meant to be kept up to date for such occasions. While it already contained the AST migration from/to the 5.2 version, it was out of sync with the main branch of ppxlib.

I rebased the 5.2 relevant parts of the branch on top of our main branch to be able to cut a first preview version with 5.2 compat and fixed a couple of bugs and quirks in the code base that prevented the test suite from running properly with the 5.2 support.

After the first release of the preview version we discovered a series of bugs with the help of @kit-ty-kate, @octachron and @anmonteiro. The most important among those being:

  • A bug in the round trip migration of the new Pexp_function node from 5.2 that was causing compilation errors when the function’s return type was coerced.
  • The new syntax for ocaml.ppx.context was causing driver crashes when reading some binary ASTs. I wrote a migration for those attributes that fixed the issue.
  • The driver was silently relying on the compiler to re-open files to display source quotation when reporting located exceptions. Since this was removed from the compiler in 5.2, I fixed the driver to properly set Location.input_lexbuf and re-enable source-quotation.

ppx_deriving maintenance

ppx_deriving is quite a central piece of the ppx ecosystem, especially for the set of standard derivers it provides (pretty-printers, equality and comparison functions, etc.).

The project was lacking maintainers with enough time to review an important PR migrating those standard derivers to ppxlib. This is important because it makes those quite broadly used derivers better integrated with ppxlib features and improves both performances and error reporting for their users as they are now applied as part of the main ppxlib driver AST traversal.

I reviewed the PR and cleaned up the repo a bit to attempt a release, something that has not happened for 3 years for this package.

The initial release failed for two reasons:

  • ppx_deriving.show’s deriver accepts an argument that specifies how the implementation should behave without impacting the function signature. In the ppxlib port we did not register this argument for the signature deriver since it had no impact on the generated code there. It turns out at least one opam package used the argument in an .mli file so we added it for compatibility as ppx_deriving duplicated the set of arguments for implementation and interfaces
  • ppx_deriving used to automatically register extensions for each derivers that can be used to inline the derived function for a given type in an expression. We preserved this in the ppxlib port but it caused a conflict with ppx_let. This conflict should be resolved on ppx_let’s side as they were declaring an extension named map without any possible prefix such as ppx_let.map which is the recommended way. Using a prefix allows the user to disambiguate if several ppx declare an extension with the same base name.

We had to cancel the release to fix those issues before attempting again.

General maintenance

There was also some regular maintenance such as improving our homemade expect test runner to be able to better run our tests across all supported compiler versions, reviewing all pending PRs, upgrading ocamlformat and cutting a release of ppxlib with the latest features.

Next steps

Along with the general maintenance of the project there are two things that would greatly reduce the maintenance burden on ppxlib and would improve the state of the ecosystem for the community that we would like to work on.

Upstreaming Astlib to the compiler

This has been in the works for quite a long time but the previous maintainers haven’t had the chance to see it through.

The plan is to upstream a small part of ppxlib into the compiler to ease the release process for new compilers.

This small library should contain the minimal subset of stable API ppxlib needs to properly function and, most importantly, the AST migrations from the current version of the compiler to the previous version and back. The idea is that trunk would then provide the migration to the latest released version at all time and ppxlib would be able to use those if it does not natively supports them yet. That would make testing the compiler on opam work without requiring a special release of ppxlib and would also ease the migration writing process as they would be written at the time of the AST change by the person who modified it and therefore that is most qualified to do so.

Indeed the migration writing process is time consuming at the moment because it is done by ppxlib maintainers when the next ocaml releasing is closing in and requires us to dive into changes we are not necessarily familiar with.

During the compiler release process, ppxlib would incorporate those new migration to the entire set of migrations it supports, allowing it to be compatible with the “new” trunk until the next release.

The core of this work is to formally write down this process and start the discussion with the compiler team. Once we agree on a plan, I don’t expect there is a lot of code to write for this as all of it already lives inside ppxlib.

Refining the release process for ppxlib with an AST bump

Part of the ppxlib contract is that all ppx-es written with ppxlib must use the same version of the AST chosen by ppxlib. This allows for better performances and semantics of rewriting. This version is usually the latest supported version as it is required for ppx-es to support all the new language features. It sometimes happen that the internal AST version lags behind if no new features
would be “unlocked” by upgrading the AST, as it has been the case since 4.14.

Every now and then ppxlib has to bump its internal AST though, potentially breaking its API and therefore a few of its reverse dependencies.

In the past few years, we decided to build the entire set of reverse dependencies every time were releasing such a change and to send PRs to fix revdeps to help keep the ppx ecosystem sane and avoid putting to much pressure on individual ppx-es maintainers.

We know that we will have to do this for the 5.3 release as it will be adding effects syntax.

The current workflow relies on building a “duniverse” i.e. some sort of large dune-project containing ppxlib and a clone of all its reverse dependencies. This proves quite a challenge as it is often hard to get everything to build together.
An ideal solution would be to rely more on opam-ci for this but in its current state it is not very reliable.

I’d like to spend some time on improving the process of building revdeps of ppxlib and submitting PRs and experiment to prepare for the next bump which will include the 5.2 changes to Pexp_function that we expect to potentially break quite a lot of reverse dependencies.