[ANN] Building the OCaml Toolchain with Bazel - PoC

I’m inordinately pleased to announce that a Proof-of-Concept Bazel build of the OCaml compilers and tools (latest trunk version) is available for testing and exploration. Tested on macOS 12.6 and Ubuntu 20.04.5 LTS. It is available at

This uses a stripped-down version of rules_ocaml, embedded in the repo in subdirectory bzl.

Currently the following targets build and run: ocamlc, ocamlopt.byte, ocamllex, and all of the tools in the tools/ subdirectory.

Supports Zig as an alternate C toolchain and as a cross-compiler. I was also able to use the LLVM toolchain from grailbio/bazel-toolchain, but that was months ago and I have not tested it with this new version. The code is still in the WORKSPACE.bazel file, though, so getting it to work would be a good starter project.

Cross-compilation (using Zig) is supported for the C code. Full support is going to take some more work. For example, to compile a Linux runtime on a Mac:

$ bazel build boot/bin:ocamlrun --config=mac_linuxamd64

It uses some interesting Bazel features. For example, it uses platforms and toolchains to support the various compilers: instead of building ocamlc.byte or ocamlopt.byte, you build one target (boot/compiler) and pass CLI args telling Bazel which platform (vm or sys) should host the build compiler and which should be the target. Similarly, debug and instrumented variants are controlled by parameters rather than separate targets. It also supports fine-grained control of compile/link flags, so developers should be able to optimize for different scenarios. You can tell it whether or not to compile .mli files separately. You can pass a custom primitives file for use with -use-prims and it will be used everywhere. Etc.
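
As a rough illustration of the platform idea, here is a minimal sketch of how a vm/sys pair could be declared with Bazel's built-in constraint_setting, constraint_value and platform rules. The package location and every target name (executor, vm, sys, vm_platform, sys_platform) are assumptions made for illustration; the repo's actual definitions may look quite different.

```starlark
# BUILD.bazel in a hypothetical //platforms package (illustrative names only,
# not taken from the repo).

# One constraint dimension that distinguishes bytecode (vm) from native (sys).
constraint_setting(name = "executor")

constraint_value(
    name = "vm",
    constraint_setting = ":executor",
)

constraint_value(
    name = "sys",
    constraint_setting = ":executor",
)

# A platform is a named bundle of constraint values. Toolchain resolution
# matches these against each toolchain's exec/target constraints.
platform(
    name = "vm_platform",
    constraint_values = [":vm"],
)

platform(
    name = "sys_platform",
    constraint_values = [":sys"],
)
```

With definitions like these, the host and target could be chosen with Bazel's standard --host_platform and --platforms flags, for example --host_platform=//platforms:vm_platform --platforms=//platforms:sys_platform. The PoC may expose its own flags or configs instead, so check the repo's docs for the real invocation.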

It’s far from polished but I think it works well enough for people to do a little testing and exploration.

Even if you have no interest in Bazel you may find the notes in the bzl/docs subdir worth looking at. I spent a lot of time studying the Makefiles and trying to understand the build, and took pretty extensive notes. They’re not very well organized but they have a lot of info, I think.

Feedback welcome. I’m not sure how far I’ll go with this, but I at least want to get complete cross-compilation working, and I’d also really like to get persistent workers going.

Enjoy!

Gregg


Congrats, that’s a really impressive achievement!

I’m very excited about these developments, as I’m hoping to use OCaml inside Google at some point. Last time I tried, not being able to build ocamlc in Bazel (Blaze) made it a bad idea to pursue this direction further.

It’s like a dream made reality :star_struck:

How is Zig able to produce more portable binaries than a C compiler?

Are you asking how Zig is able to do cross compiling? I don’t know much about it except that it’s distributed with a bunch of sysroots, clibs, etc. It uses some kind of devious trickery to shrink the stuff down to a reasonable size. It functions as a C compiler; I think it uses LLVM for that. The bazel-zig-cc toolchain packages it as a Bazel toolchain that makes it super easy to use in a Bazel project. The only thing I had to do to get cross-compilation working was create a configuration for the target platform and write some custom Bazel code to integrate it in the build, which would have to be done for any cross-compilation setup. IOW Zig itself required effectively no work. Kind of amazing.

HTH

Gregg

Yep, that was more or less my question.
Since then I took a look at the zig compiler and now I understand it better. It claims to be able to compile C code, but better :smile: so it makes sense to use it for this purpose.

Is this Bazel configuration intended to be used in OCaml projects, or is it a way to cross-compile the OCaml tools themselves? Sorry if my question is dumb.

Only for building the compilers. The general rules_ocaml ruleset must support a lot of stuff that is not needed for building the compilers (e.g. ppx support), and the build protocol for bootstrapping compilers is a little more complicated than the standard build protocol. For that very reason (among others), compiler developers need a build program that they can understand and modify with reasonable effort. So I decided to make the OCaml boot rules as minimal and understandable as possible. (I forgot to add “maintainability” in the Goals section of the readme.) I think eventually the Bazel rules (including the implementation code) will be pretty easy to understand; I’m currently cleaning up and refactoring the source code to get rid of all the cruft left over from development.

I think there is a very felicitous side-effect of all this, which is that the Bazel build program makes the build structure very clear and pretty simple, and Bazel features (like querying) make it very easy to explore and experiment. So much so that I can envision that it could be used in introductory OCaml material. Why not use the source code of the compiler itself to learn not only how to program in OCaml but how to organize and manage code?

To use a compiler built using these rules one could write some installation rules, or in a Bazel-based project use a toolchain that depends on it (rather than an OPAM-installed compiler, which is what rules_ocaml currently uses.)

I should add that the motivation for writing the compiler build rules in the first place was the Dogfooding Principle: any build system that can be used to build programs in a language should also suffice to build compilers. So I wanted to demonstrate that the Bazel rules could not only use the compilers but build them as well.
