Proposal: care more about OCaml bindings for popular libraries

arbipher · February 18, 2023, 6:43am

Yesterday I was mentioned in an LLVM RFC about disabling the OCaml bindings and removing them from the source tree. I think it is worth forwarding the message here to a broader audience.

Whether an OCaml binding should live in-tree or out-of-tree was discussed in my previous post on z3. This time, however, I believe the issue is not the (theoretical) question of which is better, but that the binding may stop working and be moved out. This is more worrisome in the OCaml 5 era because of some breaking changes in the C API.

Before tackling this problem, my first thought is that the OCaml Platform makes sure some common tools and libraries (e.g. opam, dune, ppxlib) keep working. My second thought is that there is already a health check for opam packages (http://check.ocamllabs.io).

Maybe we can extend this care to the OCaml bindings of popular libraries. I can think of some problems here:

The breakages occur on the upstream trunk branch, long before the binding is updated in opam. One way to address this may be to set up CI that monitors the trunk. I suspect most libraries rarely change how they are built and installed, so one CI setup should last a long time.
Subtle platform-related build and link problems. My experience may be limited to {z3, llvm, ocaml-torch} * {ubuntu, wsl, macos}, but I believe I have encountered enough similar problems. It would be better if the care covered these common platforms.
We may need OCaml binding maintenance guidelines that reside together with the OCaml documentation/tutorial on the C API. They could help our community as well as other communities.

(Please allow me to add my un-humble 5 cents.) I am confident I can solve these binding problems on the technical side; however, I need mentoring on how the OCaml Platform and ecosystem work.

kayceesrk · February 20, 2023, 3:00am

Thanks for the note @arbipher. @tmattio has been thinking about the larger platform roadmap and may be interested to chime in.

and from the original thread

OCaml used to be stable in its C API and FFI. OCaml 5 makes unavoidable breaking changes to support algebraic effects and multicore. I also wonder whether there are better practices for making and maintaining the bindings; let me check on that.

This is not correct. Sequential programs running on OCaml 4 can be moved to OCaml 5 without any breakages. This was a very explicit design choice. The only issues will be unrelated, deprecated functions and features finally removed as part of the major version bump. If you have concrete examples of breakages, please let me know and I am happy to have a look.

alan · February 20, 2023, 4:31am

Pure OCaml 4 programs still work with OCaml 5, but the FFI has a breaking change: naked pointers are no longer supported. The LLVM bindings use naked pointers, so I have been authoring a patch to replace the use of naked pointers so that the bindings work on OCaml 5. My patch currently passes the tests, so I feel optimistic, but it’s a lot of code to review.
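
For readers unfamiliar with the term: a naked pointer is a raw C pointer stored directly in an OCaml value. One replacement the OCaml manual describes, for pointers that are at least 2-byte aligned, is to set the low bit so the runtime treats the word as an immediate integer; boxing the pointer in an Abstract or Custom block is the other common route. The sketch below shows only the bit manipulation, in plain C. `value` is defined locally as a stand-in for the type from `caml/mlvalues.h`, the helper names are hypothetical, and none of this is taken from the LLVM patch.

```c
#include <stdint.h>

/* Stand-in for the `value` type from <caml/mlvalues.h> (assumption: the
   sketch is plain C so it compiles without the OCaml headers). */
typedef intptr_t value;

/* Hypothetical helper: encode an out-of-heap pointer as a tagged word.
   Requires the pointer to be at least 2-byte aligned (low bit zero). */
static value ptr_to_tagged(void *p)
{
    return (value)((uintptr_t)p | 1u);
}

/* Hypothetical helper: recover the original pointer by clearing the tag. */
static void *tagged_to_ptr(value v)
{
    return (void *)((uintptr_t)v & ~(uintptr_t)1);
}

/* Nonzero when the word has the low bit set, i.e. when the runtime would
   treat it as an immediate integer and never try to follow it. */
static int looks_immediate(value v)
{
    return (int)(v & 1);
}
```

Because the tagged word is never followed by the GC, the C object it refers to still has to be freed explicitly by the binding.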

kayceesrk · February 20, 2023, 6:26am

You are right. The no-naked-pointers mode predates OCaml 5, but it is OCaml 5 that made the elimination of naked pointers mandatory.

lambda_foo · February 21, 2023, 5:57am

Thanks for posting here @arbipher.

My team at Tarides provides the infrastructure and CI systems for check.ocamllabs.io, opam.ci.ocaml.org and ci.ocamllabs.io (ocaml-ci). ocaml-ci can build a project hosted on GitHub or GitLab that uses a standard opam / dune build setup. I suspect the llvm bindings might not be so standard.

What does the llvm ocaml build setup look like? Could you provide a pointer to the source? I am not familiar with the project. In general we can build any Dockerfile on Linux and some restricted subset on macOS using ocluster. It might be possible to hack a custom pipeline if I can understand the build setup and express it as a Dockerfile.

arbipher · February 21, 2023, 10:32pm

Here is my summary for building LLVM itself and its OCaml binding.

(basic)

LLVM (llvm/llvm-project) is an umbrella monorepo for the sub-projects llvm (core structures and LLVM IR), clang (compiler), lld (linker), etc.
The majority of LLVM is written in C++, and it also provides a C library that wraps the C++ libraries.
The sub-project llvm is the core, providing the common data structures and LLVM IR that other sub-projects use.
The OCaml llvm binding on opam is built on top of llvm’s C library.

(On building LLVM from source)

Building LLVM takes two steps after cloning llvm/llvm-project.
4.1. Generate the build files for a build system via cmake -G Ninja <lots of parameters>.
4.2. Build it. Which sub-projects to build depends on the default settings and the parameters provided.
Currently, building the OCaml binding is turned on by default (LLVM_ENABLE_BINDINGS is On).
The OCaml binding is one of the official bindings, so if the condition check in (4.1) passes, the OCaml binding will be built in (4.2):

# Here is the log excerpt for condition check in (4.1)
# i.e. whether you have installed the correct OCaml toolchains and libraries

## case disabled

-- Found OCaml: /Users/ex/.opam/ocaml5/bin/ocamlfind  
-- OCaml bindings disabled, need ctypes >=0.4.

## case enabled

-- Found OCaml: /Users/ex/.opam/ocaml5/bin/ocamlfind  
-- OCaml bindings enabled.

The LLVM RFC proposes changing the OCaml binding build from default On to default Off and moving it to the peripheral tier. It does not directly impact the OCaml binding, but it is a bad signal.

(On building LLVM binding from opam)

The opam package llvm builds incrementally on top of step 4. It depends on a virtual package conf-llvm that requires a system-level installation of the common LLVM libraries. That system-level installation performs steps (4.1) and (4.2) and is usually not OCaml-related. The opam package llvm then clones the source and generates the build project for the OCaml binding against the system-level LLVM libraries, to avoid building everything from scratch.
The source code for the LLVM binding lives in the LLVM tree, so the opam packages for the llvm binding contain just an opam file and a few patches. The patches make it possible to build both statically linked and dynamically linked libraries.

edwin · February 22, 2023, 12:34am

That is very much appreciated.
However, at the point of writing a binding you don’t yet know how it is going to be used. To be fully general, a library may want to aim to support multicore, which means being careful about the things mentioned in the manual (the OCaml runtime lock no longer protects global C data structures, so bindings should not use function-local static variables, C globals, etc.). That can be done as a second step after fixing naked pointers.
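
To make the point about function-local statics concrete, here is a minimal sketch in plain C (no OCaml headers; both function names are hypothetical). The first helper returns a pointer into a `static` buffer, so two callers running in parallel would overwrite each other’s result. The second writes into storage supplied by the caller and has no shared state.

```c
#include <stdio.h>
#include <stddef.h>

/* Not safe once stubs can run in parallel: every caller shares `buf`,
   and each call overwrites what the previous call returned. */
static const char *format_id_static(int id)
{
    static char buf[32];
    snprintf(buf, sizeof buf, "dev-%d", id);
    return buf;
}

/* Reentrant variant: the caller owns the buffer, so nothing is shared
   between threads or domains. */
static const char *format_id_r(int id, char *out, size_t out_len)
{
    snprintf(out, out_len, "dev-%d", id);
    return out;
}
```

Even in a single thread the static version is fragile: a second call silently changes the string the first caller is still holding.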

I should point out that fixing the use of naked pointers needs to be done very carefully. I actually managed to introduce a race condition (in the sequential OCaml 4 mode too!) while doing that in Xen: each use of Abstract and Custom values now needs to be carefully inspected so that the dereference doesn’t happen with the runtime lock released. Previously that would’ve been fine since it was just pointer arithmetic, but now it is an actual dereference of an OCaml value that the GC may have moved.
Abstract tags are even more subtle: you cannot store the C pointer obtained from an abstract tag in a local variable, because the underlying OCaml value may move at any time (when the runtime lock is released), and that C pointer is just an offset into the Abstract tag, so it’ll end up pointing to a stale location.
Here are some concrete examples of the kind of bugs to avoid while removing naked pointers:
[7/7] tools/ocaml/xc: Don't reference Custom objects with the GC lock released - Patchwork
[6/7] tools/ocaml/xc: Don't reference Abstract_Tag objects with the GC lock released - Patchwork
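
The failure mode can be reproduced in miniature without the OCaml runtime. What follows is a toy simulation in plain C, not the real GC: `toy_gc_move` stands in for whatever the collector may do while the runtime lock is released, and every name is hypothetical. A pointer cached before the move keeps pointing at the old location, whereas data copied out before the move stays valid.

```c
#include <string.h>

#define BLOCK_WORDS 4

/* Two semi-spaces of a pretend moving collector. */
static long from_space[BLOCK_WORDS];
static long to_space[BLOCK_WORDS];

/* A handle the collector updates, playing the role of an OCaml value
   that is registered as a root. */
typedef struct { long *block; } toy_handle;

static void toy_alloc(toy_handle *h, long payload)
{
    h->block = from_space;
    h->block[0] = payload;
}

/* Pretend GC: copy the block, poison the old copy, update the handle. */
static void toy_gc_move(toy_handle *h)
{
    if (h->block == to_space) return;          /* already moved */
    memcpy(to_space, h->block, sizeof to_space);
    for (int i = 0; i < BLOCK_WORDS; i++) from_space[i] = -1;
    h->block = to_space;
}

/* Buggy pattern: keep an interior pointer across the move. */
static long read_stale(toy_handle *h)
{
    long *cached = &h->block[0];   /* like keeping Data_custom_val(v) in a local */
    toy_gc_move(h);                /* "runtime lock released" happens here */
    return *cached;                /* reads the poisoned old location */
}

/* Fixed pattern: copy the payload out before the lock is released. */
static long read_copied(toy_handle *h)
{
    long payload = h->block[0];    /* dereference while still holding the lock */
    toy_gc_move(h);
    return payload;
}
```

In a real stub the equivalent of `read_copied` is to extract the C handle or the needed fields into C locals first, and only then release the runtime lock.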

Clearly I cannot trust myself to do this kind of modification, so I started writing a small static analyzer to detect this particular bug.
It is not yet ready for wider consumption, but you can see it in action here: add xapi-lintcstubs to CI, needed for static analyzer · edwintorok/xen-api@7d920c6 · GitHub (at the moment it is quite cumbersome to invoke, involving several steps: CI: add static analyzer from XAPI to CI for OCaml C stubs · edwintorok/xen@6834637 · GitHub).
Once I get the analyzer working on Xen+XAPI, the LLVM bindings seem like a good test case to exercise it on, although it might take a while to get there. If it works, it could be added to a CI to avoid introducing bugs like these in the future.

As another concrete example of what might happen to OCaml bindings if they’re not continuously tested/paid attention to: someone refactors a C function and introduces a new parameter, then tries to be helpful by updating the OCaml binding to pass the new parameter, but forgets to update the .ml file! [5/7] tools/ocaml/xc: Fix binding for xc_domain_assign_device() - Patchwork

This one can actually be detected quite easily at compile time (no need for fancy static analyzers or a CI), here is my attempt on how to do it for Xen (it requires a new enough OCaml that has compiler-libs): tools/ocaml: generate a .h file to check arity of OCaml C stubs · edwintorok/xen@29dea80 · GitHub. The approach could be adapted to work with LLVM, to at least catch bytecode function arity bugs at build time (though perhaps using migrate-parsetree would be a better idea than compiler-libs directly).
Native function arity is a bit more complicated to compute (with unboxed annotations/etc.) so I’ve only done it in the static analyzer for now, but if it is useful I could look into making a small arity checker that does both.
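
The compile-time check rests on a plain C rule: if a prototype and a definition of the same function disagree on the number of parameters, the compiler rejects the translation unit. A minimal sketch, assuming a generated header would contain one prototype per `external` in the `.ml` file. Here the would-be header content is inlined, `value` is a local stand-in for the type from `caml/mlvalues.h`, and the stub name and its OCaml signature are hypothetical.

```c
#include <stdint.h>

/* Stand-in for `value` from <caml/mlvalues.h> (assumption for the sketch). */
typedef intptr_t value;

/* --- what a generated header might contain (hypothetical) ---
   Derived from an OCaml declaration such as:
     external assign : handle -> int -> int -> int = "stub_assign"     */
value stub_assign(value handle, value domid, value dev);

/* --- hand-written stub ---
   Adding or dropping a parameter here without touching the .ml file makes
   this definition conflict with the prototype above, so the build fails
   instead of the bug surfacing at run time. */
value stub_assign(value handle, value domid, value dev)
{
    (void)handle;
    return domid + dev;  /* placeholder body, not a real binding */
}
```

This only covers stubs whose C signature is one `value` per argument; as noted above, native stubs with unboxed annotations need more work.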

nojb · February 22, 2023, 7:15am

Not directly on topic, but I wonder if it is such a bad thing to move the LLVM bindings out of tree. I remember trying to fix a bug or two in the bindings in the past and I always found setting up the LLVM build quite cumbersome. Perhaps if the bindings lived in their own repository as a standard OCaml library, it would make it easier to contribute as well.

Cheers,
Nicolas
