A Versatile OCaml Library for Git Interaction - Seeking Community Feedback

Dear OCaml community,

I would like to talk about a project I’ve been working on: Vcs, a versatile OCaml library for Git interaction. The library provides a direct-style API for interacting with Git repositories and is designed as an “interface”, or “virtual”, library, with the actual implementation provided by other components dynamically dispatched at runtime (similar to Eio vs Eio_main). This design aims for high flexibility and adaptability to different use cases.
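To make the "interface library plus runtime-selected provider" idea concrete, here is a minimal sketch using first-class modules. Every name in it (PROVIDER, vcs, current_branch, In_memory) is hypothetical and chosen for illustration only; it is not the actual Vcs API, and the real library may well dispatch differently.

```ocaml
(* Hypothetical sketch: none of these names come from the actual Vcs API. *)

(* The "interface" side: the operations a backend has to supply. *)
module type PROVIDER = sig
  type t

  val current_branch : t -> (string, string) result
end

(* A handle packs a backend module together with its state.
   Which backend is inside is only known at runtime. *)
type vcs = Vcs : (module PROVIDER with type t = 'a) * 'a -> vcs

(* User-facing function: written once, dispatches to whatever backend
   the handle carries. *)
let current_branch (vcs : vcs) : (string, string) result =
  match vcs with
  | Vcs ((module P), state) -> P.current_branch state

(* A toy backend that answers from memory, standing in for a real one. *)
module In_memory = struct
  type t = { branch : string }

  let current_branch t = Ok t.branch
end

let in_memory (branch : string) : vcs =
  Vcs
    ( (module In_memory : PROVIDER with type t = In_memory.t),
      { In_memory.branch } )
```

Code that calls current_branch never mentions In_memory, so a different backend could be packed into the handle without touching the callers.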

The Vcs library is designed to be backend-agnostic and concurrency-runtime independent. It’s compatible with both Eio and OCaml Stdlib runtimes.

My focus on the backend implementation has been pragmatic so far. I’ve concentrated on implementing a provider called git-cli that wraps the Git CLI, running it as an external process, interpreting its exit status and parsing its output. This approach has allowed me to focus on the core functionality and design of the Vcs library. I plan to explore the feasibility of supporting, in a similar fashion, luv and possibly miou as separate future work (still via the git-cli provider).
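As a rough illustration of the "interpret the exit status, then parse the output" half of that approach, here is a small sketch for the result of `git rev-parse HEAD`. The record type and function names are hypothetical and not taken from the git-cli provider; spawning the process is deliberately left out, and the check assumes a SHA-1 repository (40 hex characters).

```ocaml
(* Hypothetical types and names; not taken from the Vcs or git-cli code. *)

(* What a process runner would hand back after running a git command. *)
type process_output = {
  exit_code : int;
  stdout : string;
  stderr : string;
}

let is_hex_digit = function
  | '0' .. '9' | 'a' .. 'f' -> true
  | _ -> false

(* Interpret the result of `git rev-parse HEAD`: a zero exit status with a
   single 40-character lowercase hex string is a revision; anything else is
   reported as an error. Assumes a SHA-1 repository. *)
let parse_rev_parse (out : process_output) : (string, string) result =
  if out.exit_code <> 0 then
    Error
      (Printf.sprintf "git exited with code %d: %s" out.exit_code
         (String.trim out.stderr))
  else
    let rev = String.trim out.stdout in
    if String.length rev = 40 && String.for_all is_hex_digit rev then Ok rev
    else Error (Printf.sprintf "unexpected output: %S" rev)
```

Keeping the parser as a pure function of the captured output means it can be tested without a Git installation, and the same function works regardless of which runtime (Eio, Stdlib, or another) actually ran the process.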

The project is currently in the draft stage in a private repository, and I am gradually working through the process of open-sourcing the dependencies for this project. The API it supports is very incomplete, and I am extending it incrementally as I need more.

While the code isn’t published yet, I’ve written a public README that outlines the project’s architecture, design principles, and motivation. You can read it here.

I’m reaching out to the community for feedback on the approach. I’m particularly interested in your thoughts on the use of a provider-based parametric library and the granularity of the provider interfaces via Traits, but I would be interested in any other feedback you feel like sharing. Please feel free to open issues, ask questions, or contact me if you are interested in the project.

Additionally, I’m in the process of selecting an appropriate license for the project. I think it is fair to respond to that with “We as a community cannot give you legal advice on that matter, you should contact a lawyer”. On the other hand, if you or someone you know has spent a lot of time thinking about licenses for the OCaml ecosystem, it would be a loss for me not to talk to you. I’ll be grateful if you do reach out.

Thank you in advance for your time and feedback. I look forward to engaging with the community on this project.

Best, Mathieu

Such a library could be useful. At Semgrep we developed our own poor man’s interface to Git (semgrep/libs/git_wrapper at develop · semgrep/semgrep · GitHub), but I’d rather switch to something more complete.

I took the same approach (exec + parsing stdout) for my very limited wrapper.
I had the ambition to migrate it to the git library from Mirage (git 3.15.0 (latest) · OCaml Package), but it never happened since the first approach was working well enough :grimacing:

I would definitely be interested in such a library. Do you have a timeline for when you would release it? If the main blocker is simply choosing a license, and assuming that the ultimate choice will probably be rather permissive, I don’t see much harm in making the source available to view.

I am less familiar with the specifics of licensing in the OCaml ecosystem, but I have given it some thought generally. If you want a permissive license, I would support Apache 2.0 (alternatively, licensed at the licensee’s choice of Apache 2.0 OR MIT, if you care about GPLv2 compatibility). As I understand it, LGPLv2.1 + OCaml LGPL linking exception is popular in the OCaml ecosystem, but would be less attractive to me personally due to, among other reasons, the lack of an explicit patent grant (not that I anticipate your library to somehow be patent-encumbered, but it is nice to not need to think about it), the terms of section 2, and the comparative lack of discussion of contributions. The last point is less relevant if you also intend to have a CLA. But I am not a lawyer, and free advice is worth what you pay for it.

What kind of API are you planning to expose to users? Is the goal to version application data using Git or interact with the Git CLI in a type-safe way?

If it is the former, I’d be interested to compare your API with the one from Irmin :stuck_out_tongue: The API was designed before effects were around, so it is full of Lwt, but we are in the process of cleaning this up with effects.

Thank you, @kopecs, for your input regarding the license! I appreciate it :slight_smile:. I do like the terms of “LGPLv2.1 + OCaml LGPL linking exception” for this project. As I understand it, this allows other projects to list the opam package in their dependencies. As long as they use a released version, it doesn’t add constraints to the licensing of their project. You’ve raised an important point about the patent grant. I plan to look into this topic next.

I vaguely recall reading that the latest version of the LGPL, 3.0, has a built-in “linking exception”. I haven’t had the chance to explore this further. I’m curious whether OCaml projects licensed with “LGPLv2.1 + OCaml LGPL linking exception” chose this license because 3.0 wasn’t available at that time, or whether there’s something specific about 2.1 that makes it more appealing than 3.0 for OCaml. If the built-in aspect of the linking exception in 3.0 doesn’t hold up, I wonder if anyone has ever used something like “LGPLv3.0 + OCaml LGPL linking exception”.

In relation to this, I plan to reuse some small bits of code from another project licensed under Apache 2.0. Therefore, I want to ensure the LGPL license I select is compatible with that.

@samoht, my intention is for Vcs to be a library that interacts with the Git CLI in a type-safe way. I’m interested in Irmin and hope to try it out in the future! I have no plans for a project that overlaps with Irmin functionality, as far as I understand it (I started reading through the documentation a little while ago). A direct-style Irmin sounds very exciting, thank you!

I fear that implementing Vcs purely on top of the Git CLI may become limiting at some point. Therefore, I’m considering an architecture that allows libraries depending on it to be written now, so that later, when another backend is available (I’m optimistic about using ocaml-git here), there should be no need to change the user code. That’s one of the goals, anyway.

I came across an interesting and related discussion on the forum:

In this thread, I noticed the combination of LGPL-3.0 + linking exception used in the caqti project. This provides a good case study for me to base my research on.

@paurkedal, I hope you don’t mind me asking – were you satisfied with the COPYING.LINKING file that was included in the end?

Thank you all for your time and input.

Yes, from my point of view, I am happy to use a license exception that is designed for the core license. It was hard to understand how the OCaml linking exception was to be interpreted in the context of LGPL 3.0, so the linking exception offers peace of mind in the absence of an official recommendation from someone with a legal background.

I should add that there was an issue with license parsing in opam for this exception, but this was fixed when I sent the opam-repository PR for Caqti 2.1.

For what it’s worth, most of the MirageOS libraries use the ISC license. Our goal is to encourage use in any context to ease adoption – unfortunately code having GPL and LGPL licenses will frighten many corporate lawyers and will need some extra level of persuasion (and in some cases will be a lost cause).

I’ve recently pushed updates to the vcs public repo with most of the contents of my early draft. For those interested in early experimentation, I’ve created a release on my custom opam-repository.

The interface is still very much a work in progress, but you can already see how the pieces fit together. In particular, the provider component, which is crucial for the dynamic dispatch implementation of vcs, is now available on opam. The vcs project serves as a good real-world example of the capabilities this provides.

Please feel free to open issues on GitHub with general feedback, requests, or to start a discussion.

@kopecs, I don’t have a precise timeline for an initial publication on opam yet. I’ve created this milestone if you’d like to follow the progress or leave a comment. Thank you for your interest!

@paurkedal: Your setup has been a great source of inspiration for me, and I’ve found it incredibly helpful. Thank you so much!

@samoht: I chose the approach that felt most comfortable for this particular project, but I greatly appreciate your input. I’ll definitely keep your suggestions in mind for future projects. Thanks!
