I’d like to share some insights into a few projects that I’m currently developing in OCaml. These projects are a blend of my interest in development tools and my desire to contribute to the community. I hope this overview sparks your interest and opens up opportunities for discussion or collaboration. Let’s dive in!
Dunolint
I’m working in a monorepo that contains hundreds of dune files. It’s not massive, but it was enough to encourage me to create a tool that checks invariants in my dune setup and helps with ergonomic issues, such as automatically applying systematic changes across many dune files. It supports things like enabling instrumentation, configuring recurring lint or preprocess flags, sorting libraries alphabetically, etc. Recently, I’ve added support for dune-project files, enabling me to automate workflows like updating dependency bounds for multiple projects within a monorepo simultaneously.
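As a rough illustration of the "sorting libraries alphabetically" kind of rule, here is a minimal OCaml sketch. It is not dunolint's implementation: the names `is_sorted` and `normalize` are hypothetical, and the sketch works on a plain list of library names rather than on a parsed dune file.

```ocaml
(* Hypothetical helpers for illustration only. A real tool would operate on
   the parsed contents of a dune file, not on a bare list of names. *)

(* Invariant check: are the names in strictly increasing alphabetical order? *)
let rec is_sorted = function
  | a :: (b :: _ as rest) -> String.compare a b < 0 && is_sorted rest
  | [] | [ _ ] -> true

(* Automatic fix: sort alphabetically and drop duplicates. *)
let normalize libraries = List.sort_uniq String.compare libraries

let () =
  let libraries = [ "stdio"; "base"; "core"; "base" ] in
  assert (not (is_sorted libraries));
  (* prints "base core stdio" *)
  print_endline (String.concat " " (normalize libraries))
```

Separating the check from the fix mirrors the two uses described above: the check can report a violated invariant, while the fix applies the systematic change.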
The code is currently in a rough state. I’m using it as a buffer to experiment and iterate quickly without needing to consult the dune developers for every minor ergonomic query as I learn more about dune and my requirements. At this point, I’m uncertain whether to consider it as throwaway code or whether it would be worthwhile to refine the scope and user specifications for a project of this nature.
Central
I appreciate the monorepo approach, but I prefer to publish projects separately, sometimes with varying visibility levels (e.g. public vs private repositories on GitHub). I was looking for a workflow that would let me keep the benefits of working in a monorepo while allowing bidirectional promotion of changes between the monorepo and the individual subprojects it contains.
I explored git submodules a bit, but I needed more flexibility in editing the history between the internal operations (within the monorepo) and the final published commits in the individual subprojects. I found it more convenient to version the subprojects as part of the monorepo itself. This approach aligns more closely with git-subrepo, which I used to build some early versions of this project. I’m gradually moving things to an OCaml implementation.
Diff4s
A diff4 is what you get when you rebase a branch under development. You started from a specific upstream revision (old-base = b1) and reached a working HEAD (old-tip = f1); in the meantime, upstream moved to a new revision (new-base = b2). After performing your rebase or merge and resolving conflicts, you end up with a new HEAD (new-tip = f2).
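To make the four revisions concrete, here is a minimal OCaml sketch of how they could be modelled. The type and function names are hypothetical and are not the API of the library described in this post; the only external command assumed is the standard `git diff <from> <to>`.

```ocaml
(* Hypothetical types for illustration; not the actual library API. *)
type rev = string (* a git revision: a commit hash, a branch name, ... *)

type diff4 =
  { old_base : rev (* b1 *)
  ; old_tip : rev (* f1 *)
  ; new_base : rev (* b2 *)
  ; new_tip : rev (* f2 *)
  }

(* The four plain diffs that relate the four revisions to each other. *)
let edges t =
  [ "old feature", (t.old_base, t.old_tip)
  ; "new feature", (t.new_base, t.new_tip)
  ; "base change", (t.old_base, t.new_base)
  ; "tip change", (t.old_tip, t.new_tip)
  ]

(* Render each edge as a [git diff <from> <to>] command line. *)
let git_diff_commands t =
  List.map
    (fun (label, (from, to_)) ->
      Printf.sprintf "# %s\ngit diff %s %s" label from to_)
    (edges t)

let example = { old_base = "b1"; old_tip = "f1"; new_base = "b2"; new_tip = "f2" }

let () = List.iter print_endline (git_diff_commands example)
```

Reviewing a rebase then amounts to comparing the "old feature" diff with the "new feature" diff while taking the "base change" into account, which is why all four revisions are needed.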
In this blog post, Yaron Minsky discusses patdiff4, a tool that manages diff4s in Iron, a code review system used at Jane Street.
I contributed to early versions of patdiff4 and Iron. However, it’s been some time (I can’t believe this post is actually 10 years old). It’s also been a while since I left Jane Street.
I’ve recently regained interest in this topic and started developing a library that computes and manipulates diff4s for git repositories. My goals are:
- To create a standalone tool that can aid in reviewing complex rebases you might encounter locally.
- To develop a library that I can incorporate into a more comprehensive code review system for git, inspired by Iron (see the next section).
I don’t plan to focus heavily on rendering issues. Instead, my aim is to design the tool in a way that leverages the user’s git difftool and mergetool configuration, along with other custom strategies and third-party tools. (As an example, I recently learned about git range-diff, which seems to render some diffs-of-diffs, and read up on some ideas for rendering its output side by side.)
Cr
As previously mentioned, I worked on Iron and used it daily during my time as a developer at Jane Street. For those unfamiliar with Iron, I recommend this public talk.
Nowadays, my development primarily involves git repositories, using a PR model and various GitHub features like CI via workflow actions, etc.
Occasionally, I find myself missing certain aspects of Iron, although I haven’t precisely identified what those aspects are. I’ve often contemplated whether elements of an Iron-like workflow could be adapted to decentralized development on platforms like GitHub.
Over the years, several people familiar with Iron have considered similar ideas. In 2018, James Somers wrote a blog post envisioning what this could look like, viewed through the lens of editor integration.
I’ve begun prototyping a system that operates with git and supports diff4s. It tracks what you’ve reviewed in each branch and what you need to review when branches change. It also presents you with an aggregated “todo” dashboard across all your repositories, regardless of where they’re hosted.
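The core of that bookkeeping can be sketched in a few lines of OCaml. This is a simplified model under my own assumptions, not the data model used by cr: it remembers, for each branch, the tip that was last reviewed, and reports the branches whose current tip differs. The names `review_state`, `mark_reviewed` and `todo` are hypothetical.

```ocaml
module String_map = Map.Make (String)

type rev = string

(* Hypothetical model: for each branch name, the tip that was last reviewed. *)
type review_state = rev String_map.t

let mark_reviewed (state : review_state) ~branch ~tip : review_state =
  String_map.add branch tip state

(* A branch needs review when it was never reviewed, or when its current tip
   differs from the tip recorded at the last review. *)
let todo (state : review_state) ~current_tips =
  List.filter
    (fun (branch, tip) ->
      match String_map.find_opt branch state with
      | Some reviewed -> not (String.equal reviewed tip)
      | None -> true)
    current_tips

let example_state =
  let state = mark_reviewed String_map.empty ~branch:"feature-a" ~tip:"f1" in
  mark_reviewed state ~branch:"feature-b" ~tip:"g1"

(* feature-a moved since its review, feature-b did not, feature-c is new. *)
let example_tips = [ "feature-a", "f2"; "feature-b", "g1"; "feature-c", "h1" ]

let () =
  List.iter
    (fun (branch, tip) -> Printf.printf "%s needs review at %s\n" branch tip)
    (todo example_state ~current_tips:example_tips)
```

In this model, a branch that moved after being reviewed supplies the old tip (the recorded one) and the new tip (the current one), which is where diff4s come into play.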
I take great pleasure in acknowledging Iron as a source of inspiration for my cr project. However, because of the distributed nature of cr, the architectural similarities with Iron might be minimal. Iron is a centralized, comprehensive system with numerous features, some of which I’m probably not even aware of given how long it’s been since I last used it. In contrast, my aim with cr is to create a somewhat minimalistic layer to assist in tracking review states aggregated from many sources, with the intention of making it accessible to a wider audience. I anticipate that the two systems will evolve independently, without maintaining any specific ties.
Currently, I’m using cr to monitor the progress of branches I’m interested in across numerous public git repositories, many of which belong to the OCaml community. This has been an enjoyable experience[^1].
At the moment, the review metadata is persisted in a local git repository. I would like to move some of this information into the git repositories under review to enable collaboration and sharing of branch metadata (e.g. some JSON files pushed to a dedicated branch). I plan on using CRDTs for this part.
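To illustrate why CRDTs are attractive here, below is a minimal OCaml sketch of one of the simplest CRDTs, a grow-only set, applied to review marks. This is purely an assumption for illustration: the post does not say which CRDT cr will use, and the `Mark` record and its fields are hypothetical.

```ocaml
(* Hypothetical review mark: "this reviewer has reviewed this branch at this
   revision". The fields are illustrative, not cr's actual metadata. *)
module Mark = struct
  type t =
    { reviewer : string
    ; branch : string
    ; rev : string
    }

  let compare (a : t) (b : t) = Stdlib.compare a b
end

module Mark_set = Set.Make (Mark)

(* Grow-only set: replicas only ever add marks, and merging is set union.
   Union is commutative, associative and idempotent, so replicas converge
   regardless of the order in which they exchange their states. *)
let merge = Mark_set.union

let alice = Mark_set.singleton { Mark.reviewer = "alice"; branch = "feature-a"; rev = "f1" }
let bob = Mark_set.singleton { Mark.reviewer = "bob"; branch = "feature-a"; rev = "f2" }

let () =
  assert (Mark_set.equal (merge alice bob) (merge bob alice));
  assert (Mark_set.equal (merge alice alice) alice);
  Printf.printf "%d marks after merge\n" (Mark_set.cardinal (merge alice bob))
```

A grow-only set cannot express retracting a review; handling that would need a richer CRDT, which is outside the scope of this sketch.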
An Example Combining the Tools
Let me share how I recently combined these tools to upgrade my code base to the new v0.17 Jane Street opam packages.
First, I created a cr-branch in my monorepo where I used dunolint to automatically edit all dune-project files in my project. I made all the required changes to make the tree compile and reviewed the changes with cr. Then, I used central to automatically distribute the changes to each public repository managed via my monorepo.
Finally, I relied on diff4 to assist with rebasing other changes I had in progress across this upgrade.
In Conclusion
I hope to gradually make progress on this over the coming months, identifying and publishing reusable building blocks along the way (e.g. a typed git API).
Embarking on this project has been a journey outside of my comfort zone. I’m not particularly experienced with open-source development, and this endeavor is shaping up to push me into new territory. Isn’t this where the magic happens?
I’m developing these projects as a part-time hobbyist without external funding. My prototypes are incomplete, flawed and not ready for public use yet. That being said, I’m open to early discussions and am interested in similar work happening elsewhere. If you have overlapping use cases or motivations, I’d love to hear from you! You can reach out to me here, or at any of the email addresses attached to my commits on GitHub.
Best regards, Mathieu
[^1]: I leave you with a demo of a cr session in the terminal: