Ppxlib.0.22: an update on the state of ppx

Dear all,

We’re happy to announce the release of ppxlib.0.22.0, the first release of ppxlib fully compatible
with OCaml 4.12.
The sole feature of this release is the bump of the internal OCaml AST used by ppxlib from
4.11 to 4.12, which allows you to use 4.12 language features with ppxlib and any ppxlib-based ppx.
Note that ppxlib has been compatible with the 4.12 compiler since 0.19.0, but you couldn’t use 4.12
language features until now.
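To make that concrete, here is a small, hedged example. OCaml 4.12 introduced injectivity annotations on type parameters, a piece of syntax the 4.11 AST could not represent. With this release, a ppxlib-based deriver (ppx_sexp_conv is used below purely as an illustration) can be applied to such a declaration:

```ocaml
(* The ! injectivity annotation is 4.12 syntax. Before ppxlib.0.22, the
   internal AST could not represent it, so files using it could not be
   processed by ppxlib-based rewriters. *)
type !'a box = Box of 'a [@@deriving sexp]
```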

This is the third such AST bump release since we announced our plan to improve the state of the PPX
ecosystem here, and we thought it’d be a good time to report back to you and tell you
how things are going on this front.

For those of you who aren’t familiar with this plan, the goal is to upstream a minimal, stable,
ocaml-migrate-parsetree-like API on top of the compiler-libs, called Astlib. It will allow us
to keep ppxlib and any ppx based on ppxlib compatible with OCaml trunk at all times.
To allow better performance and clear composition semantics, all ppxlib-based ppx-es need to use
the same AST (as opposed to ocaml-migrate-parsetree-based ppx-es), so from a certain perspective this
plan simply moves the breaking API up one step, from compiler-libs to ppxlib.
To greatly ease the maintenance of ppx-es and to prevent opam-universe splits, we decided
that every time we cut a breaking ppxlib release, we will send patches to keep the existing ppx-es compatible with the latest version, and therefore with the latest OCaml compilers and language features.
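As a reminder of what sits on top of this AST, here is a minimal sketch of a ppxlib-based rewriter (the `stringify` extension and its behaviour are invented for illustration). It is written against ppxlib’s API only, which is exactly why a ppxlib upgrade, possibly with patches from us, is all it takes to follow new compiler releases:

```ocaml
open Ppxlib

(* Expand [%stringify e] into (string_of_int e). *)
let expand ~loc ~path:_ (e : expression) = [%expr string_of_int [%e e]]

let stringify =
  Extension.declare "stringify" Extension.Context.expression
    Ast_pattern.(single_expr_payload __)
    expand

let () =
  Driver.register_transformation "stringify"
    ~rules:[ Context_free.Rule.extension stringify ]
```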

While this may seem like a tremendous task and a huge amount of work, dune and the tools that arose
in its wake, such as opam-monorepo, have greatly simplified this kind of work.

Ahead of OCaml releases, we prepare a branch of ppxlib with the upgraded AST. We then fetch
opam-repository to gather a list of relevant reverse dependencies (i.e. packages whose latest version
depends on ppxlib and is compatible with ppxlib’s latest version) and, thanks to opam-monorepo,
assemble a dune workspace containing a clone of each of those reverse dependencies, our ppxlib branch
and all of their dependencies.
We then use dune to build all the packages we’re interested in and simply follow the compilation errors
until everything builds successfully with the new ppxlib.
What remains is to open PRs on the relevant repositories to upstream those changes, after which
maintainers have everything they need to cut a new compatible release.
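Concretely, once the workspace is assembled, the loop looks roughly like this (a sketch using opam-monorepo’s standard subcommands; our actual scripts automate more than this):

```sh
# In a workspace whose packages pin ppxlib to the upgrade branch:
opam monorepo lock   # solve a single dependency universe for all packages
opam monorepo pull   # fetch every dependency's sources into duniverse/
dune build           # build everything; fix errors and repeat until green
```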

Most of this process is automated using scripts, but it still requires a bit of handiwork. We aim to
extract tools that further improve this workflow and reduce the time and effort required, but it has
already been surprisingly smooth. Our experience with the 4.10, 4.11 and 4.12 upgrades so far is that
most reverse dependencies don’t need an upgrade at all, and that it’s far less demanding for one person
to upgrade all the packages that need it than it would be for each individual maintainer to
understand the changes in the AST and do the upgrade themselves.

It’s worth noting that for this to work well, the ppx-es and all their dependencies have to build
with dune. We do maintain a separate opam-repository with dune ports of commonly used packages, so in
practice most projects fall into this category, but a few exceptions remain and they are therefore
not taken into account in this upgrade process.

We’re also trying to improve the tracking of the upgrade’s progress: for the 4.12-compatible
release we created a GitHub project listing all the packages we considered and their status. We also
keep track of the packages we had to exclude and why.
During this upgrade, we considered 80 opam packages, out of which only 4 needed to be patched and 6
had to be excluded from the process because we couldn’t reasonably get them to build in our workspace.

Once we have a better idea of what makes a package easy to upgrade, we plan on releasing a set of
reasonable rules to follow in order to benefit from these upgrades. We’ll keep you updated on this!

All in all we’re pretty happy with this new process, and although it still needs to be refined, we’re
confident it can grow into something sustainable as we build tools and CI to support it. Hopefully
these will also benefit the wider community and help grow a healthier opam universe.

27 Likes

Thanks @NathanReb for the update and to the rest of the Ppxlib team for your recent work :slight_smile: I’ve been able to take on more PPX dependencies recently thanks to the added stability, which is really great; I hate writing generate-able code by hand…

As a library maintainer, it’s especially exciting to see the PPX team developing such nice tooling for ecosystem maintenance. I look forward to the day when I can easily pull all of my libraries’ reverse dependencies with opam-monorepo and use that to gauge API breakages and coordinate downstreaming of patches. Great stuff! :tada:

7 Likes

Curious about the current status of Astlib. I was closely following ppx at one point but it hasn’t seen much activity recently. Thanks for all your hard work.

2 Likes

It’s in progress. Not much happened in the past couple of months while we were finishing the port of a few projects to ppxlib and doing the 4.12 upgrade, but @pitag re-started working on Astlib a week ago. You can follow our progress via the public meeting notes.

Note however that the ppx project served our original goal of providing a “forever stable” API for ppx rewriters. It has been on pause since August 2020 while we were trying the “upgrade the world” method, which as @NathanReb pointed out is working pretty well in practice. At this point, it’s looking more and more likely that we won’t resurrect the ppx project.

1 Like

This is unfortunate, though I understand the appeal of a centralized upgrade process. I had hoped to eventually use ppx’s for various project-local metaprogramming tasks. Absent the air cover that the ppx library was to eventually provide against API churn, the tradeoffs of doing so haven’t worked out so far. To quote from Convert from one complicated type(like AST) to another - #9 by cemerick:

(Anecdote: to date, I’ve found it to be far easier to simply push code generation (producing types and associated data structures from predefined JSON files for example) to bespoke separate programs that Format.printf code to where it needs to go, and tied together with dune rules. Doing the same thing via ppx was just much more complicated and touched types and APIs that I knew I wasn’t going to recall in the slightest the week after, nevermind a year from now.)
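For concreteness, a generator of that shape can be as small as the following sketch (the type name and fields are invented here; a real generator would read them from the predefined JSON files):

```ocaml
(* gen_types.ml: print an OCaml module to stdout, in the spirit of the
   anecdote above. A real generator would parse the JSON input instead
   of hard-coding the field list. *)
let () =
  let fields = [ "id"; "name" ] in
  Format.printf "type user = {@.";
  List.iter (fun field -> Format.printf "  %s : string;@." field) fields;
  Format.printf "}@."
```

A dune rule along the lines of `(rule (target user_types.ml) (action (with-stdout-to %{target} (run ./gen_types.exe))))` then writes the generated code to where it needs to go at build time.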

Thank you for your efforts, and for clarifying the design direction in this case.

1 Like

A downside to this “upgrade the world” approach (hypothetical to me, but maybe not for others) is that it leaves proprietary code in the dust.

1 Like

Yeah, that’s what I meant re: “project-local metaprogramming”. A centralized migration process is probably great (and maybe even automatable much of the time) for the 80% case of ppx’s that implement widely-useful language extensions. Of course, such cues help to ensure that that case is effectively 100% of the actual usage of the ppx mechanism.

Well, commercial users might also have more resources to run their own internal upgrade processes. This is certainly the case for Jane Street, where we have some internal-only PPXs, in addition to the ones we open-source.

To give a bit of context on why we switched from the “forever stable” plan to the “upgrade the world” plan: the old ppx project was becoming technically too complex. We often had to remind ourselves how the whole thing was supposed to work; we were basically building something as complex as camlp4 was. Sometimes such complexity is justified, as in the compiler or the build system, but here it didn’t seem warranted to us, so we decided to go for a simpler design.

With our new plan, it will be easy for anyone to understand what we did and to contribute. In the long run, if the set of people who maintain the ppx ecosystem changes, it will be easy for the new people to pick things up. On the compiler side, the process for upgrading Astlib will be simple, and in particular much more straightforward and easier to reason about than the process we would have ended up with under the “forever stable” plan.

Finally, I’d just like to point out that upgrading the ppx world used to be painful because the ecosystem was very fragmented. As the ecosystem is becoming more unified, it is also becoming much simpler to upgrade the world, as Nathan mentioned in his post. In the long run this should benefit proprietary code bases as well.

6 Likes

Essentially we’re making a set of ‘blessed’ PPXs (i.e. all the ones on opam) which are updated in lockstep with the compiler. In a way, they’re virtually bundled with the compiler, as nothing else in the ecosystem is this compiler-dependent. Makes sense to me, rather than providing infinite backwards-compatibility across compiler versions.

That’s the idea. And in fact, there is no reason to limit this process to the compiler. Just imagine: you are a library author and want to do a major release of one of your libraries. It’s a major release, so it includes some breaking changes. You want to help your users adapt to the changes, and to do so you follow these steps:

  1. press a button and opam presents you with a buildable workspace composed of all your reverse dependencies
  2. you cd into this workspace, run dune build and follow compilation errors just like you would in a single project
  3. once you are done, you press another button and opam submits pull requests to all the projects that you updated

How does that sound? :slight_smile:

Well, not quite. There are quite a few projects that are compiler-dependent, such as utop, merlin, odoc and more generally any project that uses the compiler-libs.

4 Likes

If you want to apply this idea to libraries in general, this sounds very similar to Titus Winters’ telling of the compatibility story of Google’s monorepo and Google’s open-source library Abseil (2018). He explains that the team changing an API is responsible for updating all the affected code at Google. Abseil, on the other hand, is open-source, so in order to retain the freedom to make the changes they want, they state guidelines users must follow and promise to publish an upgrading tool if something really needs to break despite those guidelines. This is a bit different, but it is the compromise they found to make their code available to the community. The paper is a criticism of Semantic Versioning, which it does not see as a real solution.

With the “upgrade the world” approach, on the plus side, whoever makes a change is confronted with the consequences of their change (so they may realise that it was larger than intended and decide to revise it), and there is also a sort of cache locality benefit whereby the one propagating the change is also the one who already understands it in detail.

On the minus side, I see several issues with the “upgrade the world” approach for general libraries:

  • The closed-world assumption: not everything that counts is on opam. Code written by individuals and research teams might not be on opam even when it is published and open-source. A good proportion of my code only interests myself and will never be published. As for proprietary code, a startup might not have the same resources as a large company like Jane Street.
  • Not every kind of breaking change can be caught with build and test failures.
  • The author of a popular library might realise that even the simplest changes are far too much work, so an “upgrade the world” approach with an unbounded set of reverse dependencies tends to degrade into a “no change” approach.

He admits this works better for large organisations like Google, and for shallow changes that do not deeply affect the semantics. So this may not be suitable for everything. There are good lessons in that paper whether or not you are convinced by the “upgrade the world” approach.

For PPXs (or at least a blessed subset) this sounds like a sensible choice and a great solution to a thorny problem!

This was my weekly backwards-compat post, thanks for reading it.

5 Likes