OCamlCC: Bazel-enabled OCaml Toolchain (new alpha release)

mobileink · January 10, 2023, 3:11am

Hi folks. A new alpha version of OCamlCC, the Bazel-enabled version of the OCaml repository, is now available.

This version includes many major changes and improvements. Highlights:

- All the compilers are buildable under the usual names: ocamlc.byte,
  ocamlopt.byte, etc. This includes the flambda compilers, whose names
  use “optx” instead of “opt” (e.g. ocamloptx.optx), and the profiling
  compilers ocamlcp.byte, etc.
- Test support. The makefiles use a custom tool, ocamltest, to run
  tests. This tool is essentially a mini build engine. Since we
  already have an excellent build engine in Bazel, we do not need
  ocamltest - all of its functionality is provided by Bazel and a few
  custom Bazel test rules. A small subset of the tests in testsuite
  (about 50) have been converted to use Bazel. Bazel’s testing
  capabilities are very powerful and flexible; for example it is easy
  to run any test individually, to create custom test suites, and to
  select tests to run based on tags - e.g. run only the inline
  expect tests in some set of test directories, or only tests
  involving integers.
- The tools (ocamldep, ocamlobjinfo) can be built and run under
  Bazel’s control, which means arguments to the tools can be expressed
  as Bazel target labels. This saves the user the problem of finding
  filesystem paths for the arguments; e.g. you can pass
  //bytecomp:Bytegen to the ocamlcmt runner, which will
  automatically configure it to be built with -bin-annot, and will
  find the correct path for the resulting .cmt file and pass it to the
  tool. Build labels can also be used with ocamldep; the runner will
  find the corresponding source file and pass it to the tool.
- Dependencies are fine-grained. For example, targets that depend on
  submodules of the standard library depend on them directly; they do
  not depend on the stdlib archive file.
- In general, signatures (.mli files) depend only on other signatures,
  and with a few exceptions “modules” (.ml files) depend only on other
  modules. All of these dependencies are explicitly expressed, and
  they are quasi-typed: if you list a module as a dependency of a
  signature, Bazel will complain.
- By default, compiler builds do not use archived libraries (e.g.
  stdlib.cmx?a); this can be configured by a command-line switch.
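The tag-based test selection mentioned in the highlights can be pictured with a small Python model. This is only a sketch of the idea, not how Bazel or the OCamlCC test rules implement it: the target labels and tag names are hypothetical, and the filter syntax (a bare tag to include, a leading "-" to exclude) is modelled loosely on Bazel's `--test_tag_filters` flag.

```python
def select_tests(tests, tag_filters):
    """Return the sorted names of tests that match a list of tag filters.

    A filter "foo" keeps tests tagged foo; a filter "-foo" drops them.
    If no positive filter is given, every non-excluded test is kept.
    """
    include = {f for f in tag_filters if not f.startswith("-")}
    exclude = {f[1:] for f in tag_filters if f.startswith("-")}
    selected = []
    for name, tags in tests.items():
        tags = set(tags)
        if include and not (tags & include):
            continue  # has none of the requested tags
        if tags & exclude:
            continue  # carries an excluded tag
        selected.append(name)
    return sorted(selected)


# Hypothetical test targets and tags, for illustration only.
TESTS = {
    "//testsuite/demo:arrays": ["arrays"],
    "//testsuite/demo:divint": ["integers"],
    "//testsuite/demo:labels": ["inline_expect"],
    "//testsuite/demo:intmatch": ["inline_expect", "integers"],
}

if __name__ == "__main__":
    print(select_tests(TESTS, ["inline_expect"]))
    print(select_tests(TESTS, ["integers", "-inline_expect"]))
```

The first call keeps only the inline expect tests; the second keeps integer tests that are not inline expect tests.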

This version also includes a lot of internal features relevant to
maintainers. For example, much of the logic involving configuration and
preprocessing has been reworked to use mustache templates. This is not
strictly speaking a Bazel thing; it’s more of an experiment in
improving and simplifying this logic. It also has implications for
portability - the mustache tooling is written in portable C, and it
replaces non-portable scripts.

I’ve added quite a bit of documentation, although as you might expect
it is neither complete nor polished. I’ve included a good deal of
information about how Bazel does certain things.

I’ve tested it on Mac and Linux. It does not require any special
configuration, so it should be pretty easy to get started with it.

If you need help: I monitor this list and the OCaml discord server, and I’ve set up an OBazl discord server. You can also file an issue.

PRs are welcome, but since this is still in flux, please file an issue
first to make sure your idea is still relevant and needed.

Cheers,

Gregg

bluddy · January 10, 2023, 7:19am

Is there sufficient reason to set up an entire OBazl discord server? I’d be happy to supply you with a channel on the main discord server.

octachron · January 10, 2023, 8:22am

This is a contradiction: if you cover only a handful of the simplest tests handled by ocamltest, you cannot affirm that “we do not need ocamltest”.

I am not sure what you meant, but that sounds erroneous: module implementations depend on the signatures of their dependencies and not on the implementation of other modules?

That also sounds problematic: shouldn’t targets that depend on the standard library depend on the stdlib archive rather than on the individual modules?

mobileink · January 10, 2023, 2:47pm

Not really. I set it up a long time ago, mainly just because I could.

A #buildsystem channel might be useful. I reckon people might still have questions about makefiles and other build systems in use. I don’t expect a whole lot of OBazl traffic.

mobileink · January 10, 2023, 3:14pm

Trust me. I only converted a small number of tests because there are so many test cases and it’s tedious work. Plus we only need one case per test type to demonstrate a test rule. I’ve got emacs code to parse and analyse the test DSL at the head of the test cases, but I do not plan to go further with that unless there is demand for it. If you do know of some tricky test cases please file an issue and I’ll write the Bazel code for them.

My bad. I hope this is clearer: module targets depend on a source (.ml) file, usually depend on a sig target (which produces a .cmi file), and may depend on other module targets. In a few rare cases a module target may depend on a sig target (a .cmi file without a corresponding .cmo/.cmx file), but sig targets only depend on other sig targets. Since each module target depends directly on its (own) sig target, by transitivity when one module target depends on another it thereby depends (indirectly) on the latter’s sig target. The Bazel rules manage all these dependencies automatically. (The lesson here: for clarity it is necessary to maintain a distinction between modules and module build targets.)
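The dependency discipline described here can be sketched as a toy model in Python. It is not the OBazl rule implementation: the `Target` class, the `kind` strings, and the target names are all hypothetical, chosen only to show the two points made above (sig targets may depend only on sig targets, and a module target reaches another module's sig target by transitivity).

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    name: str
    kind: str  # "sig" (produces a .cmi) or "module" (produces .cmo/.cmx)
    deps: list = field(default_factory=list)


def check_deps(targets):
    """Report every sig target that lists a non-sig target as a dependency."""
    errors = []
    for t in targets.values():
        for d in t.deps:
            if t.kind == "sig" and targets[d].kind != "sig":
                errors.append(f"{t.name}: sig target may not depend on module target {d}")
    return errors


def transitive_deps(targets, name):
    """Collect everything reachable from `name` through direct deps."""
    seen = set()
    stack = list(targets[name].deps)
    while stack:
        d = stack.pop()
        if d not in seen:
            seen.add(d)
            stack.extend(targets[d].deps)
    return seen


# Hypothetical targets: module B depends on module A; each has its own sig.
TARGETS = {
    "A_cmi": Target("A_cmi", "sig"),
    "A": Target("A", "module", ["A_cmi"]),
    "B_cmi": Target("B_cmi", "sig", ["A_cmi"]),
    "B": Target("B", "module", ["B_cmi", "A"]),
}

if __name__ == "__main__":
    print(check_deps(TARGETS))                     # no violations expected
    print(sorted(transitive_deps(TARGETS, "B")))   # includes A_cmi via A
```

Listing "A" instead of "A_cmi" in the deps of "B_cmi" would make `check_deps` report an error, which is the kind of complaint the post attributes to Bazel.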

Either way works. The actual dependency is always on a module - “archive” is not even a thing in the language, it’s just an implementation detail. This is largely a matter of expressivity - the Bazel rules prioritize maximal expressiveness, high resolution, and explicitness, so they allow (but do not require) that dependency on a submodule in a namespace can be directly expressed. For example, bytecomp/meta.ml depends only on Stdlib.Obj; it does not depend on everything in the stdlib archive (and so should not be rebuilt if some other submodule is changed). The Bazel rules allow us to say so explicitly.
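The rebuild argument can be made concrete with a small Python sketch. It assumes a simplified model in which a target is rebuilt whenever anything it transitively depends on changes; the labels below are hypothetical and do not claim to match the actual OCamlCC BUILD.bazel files.

```python
from collections import defaultdict, deque


def dependents(deps, changed):
    """Return every target that transitively depends on `changed`."""
    reverse = defaultdict(set)
    for target, direct in deps.items():
        for d in direct:
            reverse[d].add(target)
    out, queue = set(), deque([changed])
    while queue:
        for parent in reverse[queue.popleft()]:
            if parent not in out:
                out.add(parent)
                queue.append(parent)
    return out


# Fine-grained: Meta names the one submodule it uses (hypothetical labels).
FINE = {
    "//bytecomp:Meta": ["//stdlib:Stdlib.Obj"],
    "//stdlib:Stdlib.Obj": [],
    "//stdlib:Stdlib.List": [],
}

# Coarse: Meta depends on an archive that bundles every submodule.
COARSE = {
    "//bytecomp:Meta": ["//stdlib:stdlib_archive"],
    "//stdlib:stdlib_archive": ["//stdlib:Stdlib.Obj", "//stdlib:Stdlib.List"],
    "//stdlib:Stdlib.Obj": [],
    "//stdlib:Stdlib.List": [],
}

if __name__ == "__main__":
    # What must be rebuilt when Stdlib.List changes?
    print(sorted(dependents(FINE, "//stdlib:Stdlib.List")))
    print(sorted(dependents(COARSE, "//stdlib:Stdlib.List")))
```

In the fine-grained graph a change to the List submodule touches nothing that Meta depends on; in the coarse graph the same change propagates through the archive to Meta.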

Bear in mind that one of the goals here is to explore the design space. I don’t know if all the features I’ve enabled are optimal, but they work so we can explore the implications.

octachron · January 10, 2023, 3:51pm

I don’t see any reason to trust that kind of assertion without any evidence. Converting a handful of tests is not really impressive. The old Makefile-based testsuite worked too, for some definition of worked.

Typically, there is an unbounded number of test types, so this approach cannot work for the generic case. What would be meaningful is a proof that any ocamltest test can be converted to your setting and that this conversion results in tests as easy to write and extend as in the current setting. Lacking a proof, converting all tests would give me some moderate level of confidence in your affirmation.

Without those, an assertion like “we do not need ocamltest” is erroneous, in the best case.

Off the top of my head, the missing cmis tests and the dynlinking files are good examples of the moderately complex files that I had to write.

In other words, you make the archive a transparent implementation detail. That does sound interesting, thanks for the explanation!

ejgallego · January 11, 2023, 11:36pm

Likely dumb question, how are the dependencies in .bazel files updated?

mobileink · January 12, 2023, 12:38am

It’s an excellent question. Everything in OCamlCC is hand-rolled. Well, the first cut was programmatically generated, but I’ve since done tons of hand-tuning as I explored various design options. Upstream deps changes are sometimes slightly annoying, but no more; for example, if a module is added the build tells you what’s missing and it’s pretty easy to figure out what to add. I occasionally use ocamldep to see if I’ve got any spurious deps. But obviously hand-editing is not so good for the general case.

I’ve got several tools in the works to address this and related issues (e.g. automatic conversion of dune files). On the bright side, since deps in the rules are transitive, each target only needs to list its direct deps, which are easy to obtain. Mapping module/sig deps to libraries is the hard part, which I gather is a Generally Recognized Problem.

Also note that since all inputs must be explicitly listed before the build starts, the complete dependency tree is already encoded in the BUILD.bazel files, so nothing like ocamldep runs as part of the build. Some people won’t like that I suppose, but I think it’s the right approach. Deps don’t change much.
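The occasional ocamldep check for spurious deps mentioned earlier amounts to a set comparison, which might be sketched as follows. The sketch assumes the `file: Module Module ...` line format printed by `ocamldep -modules`; the sample line and the declared set are made up for illustration, and the comparison works on module names only, not Bazel labels (mapping names to labels is the hard part noted above).

```python
def parse_ocamldep_modules(line):
    """Split one 'path: Mod1 Mod2 ...' line into (path, set of module names)."""
    path, _, modules = line.partition(":")
    return path.strip(), set(modules.split())


def diff_deps(declared, referenced):
    """Compare hand-written deps with the modules a source file refers to."""
    return {
        "missing": sorted(referenced - declared),   # used but not declared
        "spurious": sorted(declared - referenced),  # declared but not used
    }


if __name__ == "__main__":
    # Made-up tool output and made-up declared deps, for illustration only.
    sample = "src/example.ml: Obj Printf"
    path, referenced = parse_ocamldep_modules(sample)
    declared = {"Obj", "Instruct"}
    print(path, diff_deps(declared, referenced))
```

Because ocamldep reports names that appear syntactically, a result like this is a hint for a human editor rather than a verdict.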

Then there’s programmatic editing of the BUILD.bazel files. It rather pains me to say I’ve been sitting on Sunlark for over a year. I think it has the makings of a pretty powerful editing tool for Bazel files, which would make updating the deps attributes easy (once you’ve discovered what they should be).

HTH,

Gregg

bluddy · January 12, 2023, 7:15am

We already have dune and opam channels, so adding #buildsystem at this point will probably just add confusion. We can always play it by ear - if you see demand growing for an #obazl channel, just let me know and I’ll happily open one under the Ecosystem category.
