Maintenance bottlenecks in the compiler distribution

Hi everyone,

This is a public announcement that we are experiencing a maintenance bottleneck in the development of the OCaml compiler distribution (the github/ocaml/ocaml repository).

Our development process naturally generates a fair amount of maintenance work to, among other things, discuss and integrate proposed patches, fix bugs, and react to feature requests. We don’t have enough people doing this maintenance work; currently the vast majority of this work is being done by about 5 people: David Allsopp, Florian Angeletti, Nicolás Ojeda Bär, Xavier Leroy, and myself.

Despair not! Bug fixes tend to be prioritized and handled quickly; I believe that the OCaml releases remain of satisfying quality. But other aspects are affected negatively, for example:

  • our ability to react to proposed changes in a timely manner,
  • the experience of people trying to contribute to the compiler codebase,
  • various potential improvements that get stalled by lack of manpower to work on them.

Context

The OCaml compiler distribution moved to GitHub in January 2014. Since then, maintainers have been constantly complaining that there are more people willing to submit changes/PRs than people willing to review them, creating a bottleneck on the reviewing side. (We point this out in the first section of our CONTRIBUTING.md document.)

But the effort to upstream Multicore OCaml has unfortunately made the situation worse, for at least two reasons:

  • Integrating the completely new Multicore runtime required a lot of review, integration and documentation work. We onboarded experienced Multicore developers as upstream maintainers and this helped smooth out the process, but we have still been less available for other maintenance tasks, which piled up in the meantime.

  • The sequential glaciation indirectly reduced the maintenance workforce. In November 2021 we stopped merging non-multicore-related features in the development version; as a result, various maintainers and heavy contributors moved away from working on the main development branch, to do their experiments in separate repositories (which is completely fine), and also more or less stopped following issues and performing maintenance on the main branch (which aggravated the maintenance issue).

Now that OCaml 5 has been released, many contributors will be coming back with many exciting change proposals to upstream. At the same time, our users are playing with Multicore features and will soon find countless bugs to fix, limitations to lift, etc. It’s not easy to play OCaml maintainer right now.

What can people do to help?

Contribute to the maintenance effort

Heavy contributors, in particular but not only core developers, should be expected to participate in this collective maintenance effort. We are having discussions right now about our expectations.

In my personal opinion, anyone who dedicates a substantial portion of their time to working on code intended for eventual upstreaming should dedicate a fraction of this time to collective maintenance of the upstream development trunk. (10%? 20%? Something like this.) This is the healthiest way to ensure that the volume of maintenance work scales with the volume of submissions. (If you are paid by someone to work on the compiler, please make sure that your pay also covers this maintenance fraction.)

Occasional contributors who would like to help with OCaml development should also consider whether they can help with this. (No pressure!) We have several instances of people helping with code reviews, triaging, helping make decisions on design questions etc. (I remember nice contributions in this direction from Daniel Bünzli, Gabriel Radanne, Nathanaëlle Courant, Favonia, Guillaume Munch-Maccagnoni and Kate for example.)

Generate less maintenance work

If you interact with the compiler distribution as a software project, please be mindful of the maintenance load that you generate.
If you send a Pull Request, make sure that its purpose/justification is explained very clearly, that it is easy to review, and that the benefits of the change (explain those clearly) outweigh both the long-term and the short-term costs of integrating it.
Similarly for feature requests or enhancement proposals: now is the time to focus on the uncontroversial things that are clear improvements, and to justify and explain them very clearly.

How?

It may not be immediately clear to people what “contributing maintenance work” means concretely. Right now I see three obvious approaches.

  1. Subscribe to github/ocaml/ocaml notifications and jump in when you want.

  2. Look at our issues (258 open as I write this) and see whether you think you can help. Maybe some are out of date / irrelevant and could be closed – say it. Maybe some bugs could be fixed, or some enhancement requests could be fulfilled. If you can, give it a try. It’s best to start with issues where the desired outcome is consensual (a clear bug to be fixed, with no immediate downside; a small interface improvement that does not introduce much complexity and is well-justified; etc.), rather than work on some weird syntax proposal that will in turn require ample discussion and may be turned down in the end. (If you find a wonky proposal that failed to gather consensus and probably never will, it’s actually helpful to suggest closing the issue.)

  3. Look at our pull requests (246 open as I write this) and try to see whether you can help. Again, it’s best to focus on PRs where there is a clear motivation/need. Look at the code, feel free to ask questions on things you don’t understand or comment on aspects you don’t like so much. If the PR is stale, maybe it should be rebased (would you like to give it a try?), or there isn’t much that can be reused and it could be closed – feel free to say so.

We have received the feedback that some people are still unsure what to do. In the upcoming weeks (probably in January) we will have more discussions about how to organize maintenance, to find more focused processes that encourage people to contribute in this way. I don’t think that there is a silver bullet, a magic process that will make it much easier, so I would encourage anyone interested to first try those three basic approaches above and see if one works for them.

In my experience people often self-censor and do not try to react to PRs or issues that are not in their area of expertise. But most of the compiler codebase is in only a very few people’s area of expertise; the rest of us (myself included) just make do with our imperfect understanding and try to help anyway. Do not hesitate to walk into issues outside of your comfort zone: it is a great way to learn about the compiler distribution codebase.

Happy maintaining!

27 Likes

Another possible thing to do is:

  1. Filter the issues/PRs you opened yourself and assess whether they are still relevant or reasonable at this point in time or for the foreseeable future (ideas and pie-in-the-sky stuff can be discussed to death on this forum).

I remember doing this a few times on the opam project, and I ended up closing a few things I had requested a few years earlier.

8 Likes

Note that many (most?) open-source communities have faced similar issues; there is a thread on the Rust discuss about the exact same issue:

Apparently the approach that Rust people are taking is to have people subscribe as “reviewer candidates”, and have their GitHub bot assign new PRs to one of the reviewer candidates at random, with greater priority to potential reviewers that have a short review backlog. It is interesting that, at least in that discussion, there seems to be no emphasis on the idea that people who submit PRs should also contribute to review work.
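
To make the assignment policy concrete, here is a minimal Python sketch of "pick a reviewer candidate at random, favouring short backlogs". This is only an illustration and not the Rust bot's actual code: the function name `pick_reviewer`, the reviewer names, and the weight formula `1 / (1 + backlog)` are all assumptions made for the example.

```python
import random


def pick_reviewer(backlogs, rng=None):
    """Pick one reviewer candidate at random, favouring short backlogs.

    `backlogs` maps a reviewer name to their number of pending reviews.
    The weight 1 / (1 + backlog) is an illustrative assumption, not the
    rule used by any real bot.
    """
    if not backlogs:
        raise ValueError("no reviewer candidates")
    if rng is None:
        rng = random.Random()
    names = list(backlogs)
    # A reviewer with an empty backlog gets weight 1; the weight shrinks
    # as the backlog grows, but never reaches zero.
    weights = [1.0 / (1 + backlogs[name]) for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Hypothetical candidates and backlog sizes.
    print(pick_reviewer({"alice": 0, "bob": 4, "carol": 9}))
```

With this weighting everyone can still be picked, but a candidate with nothing pending is ten times more likely to be chosen than one with nine pending reviews.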

The whole narrative around “contributing” and “PR welcome!” in OSS has completely elided the fact that making a PR is only a tiny fraction, and the easiest one, of the work needed to integrate suggestions from random people on the internet into a cohesive result.

Maybe one thing you could try to add is a kind of tit-for-tat statement in CONTRIBUTING.md. Something of the form (better said): we are happy to take your work, but integrating and maintaining it takes time and resources; there is a social expectation that for any PR of yours that gets merged, you will make a full review of another PR.

5 Likes

It’s not obvious to me what should be done after a successful local rebase. It is unlikely that any random person has the rights to force-push to every possible PR. Opening a new rebased PR will create more maintenance work, and will probably complicate reading the previous discussion.

If the PR has not had any activity in a long time, it is likely that the original author has moved on, but it would be rude to assume that. So I would leave a note in the original PR, both to ask whether the original author is interested in resuming work on the issue and to inform them that you have rebased the work in a different PR (no need to wait to open the separate PR, it can always be closed if needed).

Either the author will respond, or not, in which case the original PR can be closed and the discussion can be continued in the new one. Generally speaking, the original author should be credited in the new PR (and in the Changes entry when/if the PR is merged).

Lastly, note that sometimes PRs become stalled because of genuine issues with the design or implementation of the code. Absent a resolution of those issues, rebasing the PR will not really get it any closer to being merged.

Cheers,
Nicolas

1 Like

I would:

  • push the rebased branch on my remote fork
  • post it on the stale issue, asking the author if they are willing to adopt it
  • after a few weeks without a reply, go ahead and submit a new PR and invite maintainers to close the previous one

1 Like