My experience contributing to the LLVM bindings

Over the past few months, I’ve been contributing patches to the LLVM bindings. In particular, I ported the bindings to OCaml 5, which involved removing naked pointers. I wrote about what I learned here.


Thanks for your work on the LLVM bindings; I am using them in my compiler project.

Question: Given all the changes in LLVM 15 (like opaque pointers), do you know if its OCaml bindings are in good shape? I’m wondering if I should try to build them or just wait for the OPAM package.

The OCaml bindings are tied to a specific version of LLVM. The latest version on Opam is LLVM 14. LLVM 15 is not yet on Opam; this is the GitHub PR. LLVM 16 will be released soon; my changes adding opaque pointer functions made it into the LLVM 16 branch, but my changes for OCaml 5 were too recent to be included. OCaml 5 support should come with the LLVM 17 release.


Do you have a minimal working example?

This is a great writeup, thanks!

To test the code with and without naked pointers, I had to switch between OCaml 4.14 and OCaml 5.

This should not be necessary. Recent versions of OCaml 4 offer a “naked pointer checker” that will create alarms if it finds naked pointers, which I think suffices to port programs to the no-naked-pointer mode without using OCaml 5. (So: the program should behave as standard OCaml 4 releases that allow naked pointers, but print warning messages when it actually encounters them at runtime.)

The naked pointer checker (nnpchecker) was introduced in [ANN] A dynamic checker for detecting naked pointers, and can be configured in OCaml 4.12 or later 4.x releases by selecting the opam compiler option ocaml-option-nnpchecker.
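
For readers unfamiliar with the term: a naked pointer is a pointer to memory outside the OCaml heap that is stored directly in an OCaml value. One replacement described in the OCaml manual, for pointers that are at least 2-byte aligned, is to set the low bit so that the runtime treats the word as an immediate integer and never follows it. The sketch below only illustrates that encoding and is not necessarily what the LLVM bindings do; the helper names are hypothetical, and `value` is a stand-in typedef so that the snippet compiles without the OCaml headers.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for OCaml's `value` type from <caml/mlvalues.h>.
   Assumption: the real header is left out so the sketch builds on its own. */
typedef intptr_t value;

/* Hypothetical helper: hide an aligned C pointer behind the integer tag.
   The low bit must be free, i.e. the pointer is at least 2-byte aligned. */
static value ptr_to_val(void *p) {
  assert(((uintptr_t)p & 1) == 0);
  return (value)((uintptr_t)p | 1);
}

/* Hypothetical helper: clear the tag bit to recover the original pointer. */
static void *val_to_ptr(value v) {
  return (void *)((uintptr_t)v & ~(uintptr_t)1);
}

/* Same test as the runtime's Is_long macro: low bit set means the word is
   an immediate integer, so the GC does not treat it as a heap pointer. */
static int looks_immediate(value v) {
  return (v & 1) != 0;
}
```

The trade-off is that the OCaml side only ever sees an opaque integer-like value, so every access has to go back through C to untag it.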

Maintaining the bindings requires an understanding of the OCaml runtime to a level that currently isn’t reflected in documentation, but instead relies on institutional knowledge imparted by past contributors.

Idle question: have you considered reverting to the “naive” approach that follows the manual guidelines to the letter, adding more macros in cases where they might be deemed unnecessary with low-level reasoning? Is the worry that doing this would have a noticeable performance cost? This could make maintenance easier in the future by reducing the dependency on runtime knowledge.

I tried to use a switch with the nnpchecker option active but somehow got a segfault, and I didn’t bother investigating more deeply.

In the beginning, I added the CAMLparam macros, but I was told to remove them in the interest of keeping the diff small, easy to review, and focused on one change (removing naked pointers). Whether the bindings should just use CAMLparam is a matter for further discussion, but I was told that the macros add overhead, such as checking the thread-local runtime state and increasing the number of roots to traverse.