Finding a maintainable, sustainable build system for the LLVM bindings + Dune currently doesn't meet the package's specific needs

As much as I despise Bazel (who writes build systems in Java!?), LLVM has experimented with it for a while, and Bazel files are in their monorepo. Thus it might be a suitable choice for the OCaml bindings too.

@alan thanks for the info. Unfortunately I misspoke: I’m seeing compile failures for three of the test files, not test failures. I was thinking maybe you were using a patched version of the test files; from your message it sounds like that is not the case. So I think my code must be afflicted by version skew, or maybe there’s a bug in my build logic. I’ll figure it out and let you know.

FWIW: At this point I’m not using the Bazel code in the LLVM source tree; instead, I’m using bazel-toolchain, which downloads precompiled binaries from the llvm-project. The downside of this (aside from sluggishness) is that the llvm releases do not always include binaries for the Mac. The most recent version with such binaries is 16.0.5. So in order to be generally useful, either that tool or the in-tree Bazel code would have to be enhanced. But it’s good enough for a proof of concept.

I have done a little experimenting with the bazel-toolchain. Bazel’s toolchain mechanism is very nice indeed, and can make life much easier for build writers. It turned out to be pretty easy to add some code to the toolchain that runs llvm-config, so that config settings can be expressed very simply in the user’s build program. For example, to depend on the C headers: deps = ["@llvm_toolchain_llvm//include:llvm-c", ...]. To link to the libs for the bitreader component, add "@llvm_toolchain_llvm//lib:bitreader-libs", which adds the libs listed by llvm-config --libs bitreader. So the toolchain defines one such label for each component (llvm-config --components) and, for completeness, one for each library (so you can depend on just @llvm_toolchain_llvm//lib:llvmLLVMBitreader, for example).
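To make the one-label-per-component idea concrete, here is a minimal sketch in Python of how the output of llvm-config --components could be mapped to labels of the "<component>-libs" form mentioned above. This is not the actual toolchain code (that would be Starlark inside the repository rule); the function names component_labels and query_components are hypothetical, and the repository name is simply the one used in the examples above.

```python
import subprocess


def component_labels(components_output, repo="@llvm_toolchain_llvm"):
    """Map each component name to a '<repo>//lib:<component>-libs' label.

    `components_output` is the whitespace-separated text printed by
    `llvm-config --components`. The label scheme mirrors the
    'bitreader-libs' example in the post; it is an illustration only.
    """
    return {
        name: f"{repo}//lib:{name}-libs"
        for name in components_output.split()
    }


def query_components(llvm_config="llvm-config"):
    """Run `llvm-config --components` and build the label table.

    Hypothetical helper: assumes an llvm-config binary is on PATH.
    """
    result = subprocess.run(
        [llvm_config, "--components"],
        check=True,
        capture_output=True,
        text=True,
    )
    return component_labels(result.stdout)
```

Calling component_labels("bitreader bitwriter core") would give a table in which "bitreader" maps to "@llvm_toolchain_llvm//lib:bitreader-libs".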

To build for a particular target, e.g. AArch64, you pass --@llvm_config//target=AArch64 on the command line; the toolchain exposes this to build programs as "$(LLVM_TARGET_ARCH)", so you can do this: copts=["-DTARGET=$(LLVM_TARGET_ARCH)"], which will generate -DTARGET=AArch64.

All the toolchain stuff runs once, when Bazel configures the build environment, before actual builds kick off. So the info is already there when you build a specific target.

My code is on GitHub at obazl/ocaml-llvm (llvm bindings for OCaml). If you have questions it would probably be best to file an issue there, or ask on the OBazl discord.

Cheers,
Gregg

PS. Looking further into things, it looks like the bazel-toolchain I mentioned above is expressly designed for people who want to use the compiler(s) that come with llvm, rather than developers who want to use the llvm libraries and tools. The binary distros do not include FileCheck and llvm-lit, for example.

I looked at that for Xen too, but unfortunately that output has a lot of hardcoded paths, and has almost no conditional logic, so all the build time logic that would pick one file over another based on version of OCaml or presence/absence of packages wouldn’t work. I’ve attempted to post-process the output of dune rules -m, but I gave up on the idea in the end (it is better in my case to move the OCaml code out of Xen into its own repo where ‘dune’ can be used): https://github.com/edwintorok/xen/tree/xen-builds5/tools/ocaml/dune2makefile

Although, in the case of Xen, a hand-written shell (!) script that always builds all the files sequentially is orders of magnitude faster than the ‘maze of Makefiles’. It also produces correct output (whereas the Makefile often fails with inconsistent assumptions) and avoids all the problems with tools that produce multiple output files, which make does not support well. In fact that can be encoded in a single make rule too; here is what could be used for Xen, and it may be possible to adapt it to LLVM.

Sure, you lose all the nice things like LSP editor integration and incremental builds (well, the above Makefiles never had incremental builds either: you had to ‘make clean’ and ‘make’ every time to avoid the inconsistent assumption errors anyway, and there was a hardcoded limit of ‘-j1’ and a comment about how OCaml is difficult to build). But you get something that works on other people’s systems without requiring them to install any OCaml-specific packages (other than the compiler itself), and it is still very fast in practice (in fact faster than the theoretically incremental Makefile, even when that worked).


I haven’t tried, but also wonder how feasible it might be to adapt the build log from dune into a simple shell script. There is a very attractive simplicity to the shell script approach, I have used it on several occasions.
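As a rough illustration of what such an adaptation might look like, here is a small Python sketch. It assumes that the dune log (typically _build/log) records each external command on a single line prefixed with "$ ", with comments prefixed "# " and command output prefixed "> "; that format is an assumption on my part and should be checked against a real log. The function name log_to_script is hypothetical.

```python
def log_to_script(log_text):
    """Turn a dune build log into a sequential, non-incremental shell script.

    Assumption: every external command appears on its own line starting
    with '$ '. All other lines (comments, command output) are dropped.
    """
    script_lines = ["#!/bin/sh", "set -e"]  # stop at the first failing command
    for line in log_text.splitlines():
        if line.startswith("$ "):
            script_lines.append(line[2:])
    return "\n".join(script_lines) + "\n"


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python log_to_script.py _build/log > build.sh
    with open(sys.argv[1], encoding="utf-8") as log_file:
        sys.stdout.write(log_to_script(log_file.read()))
```

A script produced this way only replays the commands that the log records, in the order dune happened to run them, so anything dune does without spawning a command (copying sources into _build, for instance) would have to be added by hand.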

Regarding changing the upstream build to dune, I think that it is necessary to get a clear understanding of the testing consequences that such a change would entail. I do not know the full details, but my understanding is that most testing of LLVM is done post-commit, and that the existing development workflow does involve some commits being committed without going through a pull request. It might be sufficient for an ocaml.org-specific buildbot to be set up that would test all commits that touch the C or OCaml APIs. It would be important for this to be integrated into the existing LLVM system so that authors of commits receive notifications of breakages, in the expected way.


One issue is that the build log only shows external command invocations, and not actions that are performed by Dune internally, so this would need to be exposed as well. But I agree that being able to output a shell script to perform non-incremental builds independently of Dune would be useful. I think it is worth opening an issue over at ocaml/dune on GitHub.

Cheers,
Nicolas
