OCamlCC: Bazel-enabled OCaml Toolchain (new alpha release)

Hi folks. A new alpha version of OCamlCC, the Bazel-enabled version of the OCaml repository, is now available.

This version includes many major changes and improvements. Highlights:

  • All the compilers are buildable under the usual names: ocamlc.byte,
    ocamlopt.byte, etc. This includes the flambda compilers, whose names
    use “optx” instead of “opt” (e.g. ocamloptx.optx), and the profiling
    compilers ocamlcp.byte, etc.

  • Test support. The makefiles use a custom tool, ocamltest, to run
    tests. This tool is essentially a mini build engine. Since we
    already have an excellent build engine in Bazel, we do not need
    ocamltest - all of its functionality is provided by Bazel and a few
    custom Bazel test rules. A small subset of the tests in testsuite
    (about 50) have been converted to use Bazel. Bazel’s testing
    capabilities are very powerful and flexible; for example it is easy
    to run any test individually, to create custom test suites, and to
    select tests to run based on tags - e.g. run only the inline
    expect tests in some set of test directories, or only tests
    involving integers.

  • The tools (ocamldep, ocamlobjinfo) can be built and run under
    Bazel’s control, which means arguments to the tools can be expressed
    as Bazel target labels. This saves the user the trouble of finding
    filesystem paths for the arguments; e.g. you can pass
    //bytecomp:Bytegen to the ocamlcmt runner, which will
    automatically configure it to be built with -bin-annot, and will
    find the correct path for the resulting .cmt file and pass it to the
    tool. Build labels can also be used with ocamldep; the runner will
    find the corresponding source file and pass it to the tool.

  • Dependencies are fine-grained. For example, targets that depend on
    submodules of the standard library depend on them directly; they do
    not depend on the stdlib archive file.

  • In general, signatures (.mli files) depend only on other signatures,
    and with a few exceptions “modules” (.ml files) depend only on other
    modules. All of these dependencies are explicitly expressed, and
    they are quasi-typed: if you list a module as a dependency of a
    signature, Bazel will complain.

  • By default, compiler builds do not use archived libraries (e.g.
    stdlib.cmx?a); this can be configured by a command-line switch.
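
As a rough illustration of the tag-based test selection mentioned in the test-support item above, here is a minimal Python sketch. It models my understanding of how Bazel's --test_tag_filters option behaves (keep a test if it has at least one included tag and no excluded tag); the test labels and tag names are hypothetical, not taken from the repository.

```python
# Toy model of tag-based test selection; labels and tags are hypothetical.
def select_tests(tests, tag_filters):
    """Return the sorted labels selected by a comma-separated filter string.

    A leading '-' marks an excluded tag. A test is kept when it carries at
    least one included tag (if any are given) and no excluded tag.
    """
    included, excluded = set(), set()
    for tag in tag_filters.split(","):
        tag = tag.strip()
        if not tag:
            continue
        if tag.startswith("-"):
            excluded.add(tag[1:])
        else:
            included.add(tag)

    selected = []
    for label, tags in tests.items():
        if tags & excluded:
            continue  # carries an excluded tag
        if included and not (tags & included):
            continue  # carries none of the requested tags
        selected.append(label)
    return sorted(selected)


TESTS = {
    "//testsuite/tests/basic:int_ops": {"int", "bytecode"},
    "//testsuite/tests/basic:float_ops": {"float", "bytecode"},
    "//testsuite/tests/typing:expect_1": {"inline_expect"},
}

if __name__ == "__main__":
    # Roughly what a filter such as "bytecode,-float" would select.
    print(select_tests(TESTS, "bytecode,-float"))
```

An empty filter string selects every test in this model.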

This version also includes a lot of internal features relevant to
maintainers. For example much of the logic involving configuration and
preprocessing has been reworked to use mustache templates. This is not
strictly speaking a Bazel thing; it’s more of an experiment in
improving and simplifying this logic. It also has implications for
portability - the mustache tooling is written in portable C, and it
replaces non-portable scripts.

I’ve added quite a bit of documentation, although as you might expect
it is neither complete nor polished. I’ve included a good deal of
information about how Bazel does certain things.

I’ve tested it on Mac and Linux. It does not require any special
configuration, so it should be pretty easy to get started with it.

If you need help: I monitor this list and the OCaml discord server, and I’ve set up an OBazl discord server. You can also file an issue.

PRs are welcome, but since this is still in flux, please file an issue
first to make sure your idea is still relevant and needed.

Cheers,

Gregg

Is there sufficient reason to set up an entire OBazl discord server? I’d be happy to supply you with a channel on the main discord server.

This is a contradiction: if you cover only a handful of the simplest tests handled by ocamltest, you cannot affirm that “we do not need ocamltest”.

I am not sure what you meant, but that sounds erroneous: module implementations depend on the signatures of their dependencies and not on the implementation of other modules?

That also sounds problematic: shouldn’t targets that depend on the standard library depend on the stdlib archive rather than on the individual modules?

Not really. I set it up a long time ago, mainly just because I could.

A #buildsystem channel might be useful. I reckon people might still have questions about makefiles and other build systems in use. I don’t expect a whole lot of OBazl traffic.

Trust me. I only converted a small number of tests because there are so many test cases and it’s tedious work. Plus we only need one case per test type to demonstrate a test rule. I’ve got emacs code to parse and analyse the test DSL at the head of the test cases, but I do not plan to go further with that unless there is demand for it. If you do know of some tricky test cases please file an issue and I’ll write the Bazel code for them.

My bad. I hope this is clearer: module targets depend on a source (.ml) file, usually depend on a sig target (which produces a .cmi file), and may depend on other module targets. In a few rare cases a module target may depend on a sig target (a .cmi file without a corresponding .cmo/.cmx file), but sig targets only depend on other sig targets. Since each module target depends directly on its (own) sig target, by transitivity when one module target depends on another it thereby depends (indirectly) on the latter’s sig target. The Bazel rules manage all these dependencies automatically. (The lesson here: for clarity it is necessary to maintain a distinction between modules and module build targets.)
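
To make the "quasi-typed" rule concrete, here is a small Python sketch of the check as described above: sig targets may list only sig targets, while module targets may list either kind. This is only a toy model under those stated assumptions, not the actual Bazel rule code, and the target names (A, A_cmi, etc.) are made up.

```python
# Toy model of the "quasi-typed" dependency check; target names are hypothetical.
ALLOWED = {
    "sig": {"sig"},               # sig targets depend only on sig targets
    "module": {"sig", "module"},  # module targets may depend on either kind
}


def check_deps(kinds, deps):
    """Return a sorted list of (target, dep) edges that violate ALLOWED."""
    errors = []
    for target, target_deps in deps.items():
        for dep in target_deps:
            if kinds[dep] not in ALLOWED[kinds[target]]:
                errors.append((target, dep))
    return sorted(errors)


KINDS = {"A": "module", "A_cmi": "sig", "B": "module", "B_cmi": "sig"}

# Module A depends on its own sig and on module B; sigs depend only on sigs.
GOOD = {"A": ["A_cmi", "B"], "A_cmi": ["B_cmi"], "B": ["B_cmi"], "B_cmi": []}

# Same graph, except that sig B_cmi lists module A, which the check rejects.
BAD = dict(GOOD, B_cmi=["A"])

if __name__ == "__main__":
    print(check_deps(KINDS, GOOD))
    print(check_deps(KINDS, BAD))
```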

Either way works. The actual dependency is always on a module - “archive” is not even a thing in the language, it’s just an implementation detail. This is largely a matter of expressivity - the Bazel rules prioritize maximal expressiveness, high resolution, and explicitness, so they allow (but do not require) that dependency on a submodule in a namespace can be directly expressed. For example, bytecomp/meta.ml depends only on Stdlib.Obj; it does not depend on everything in the stdlib archive (and so should not be rebuilt if some other submodule is changed). The Bazel rules allow us to say so explicitly.
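
The rebuild consequence can be sketched in a few lines of Python. The two graphs below are simplified assumptions for illustration only: Stdlib.List stands in for "some other submodule", stdlib_archive is a made-up node name, and the real dependencies of Stdlib.Obj are omitted.

```python
# Toy comparison of fine-grained vs. archive-level dependencies.
def needs_rebuild(deps, changed):
    """Return the sorted targets whose transitive deps include `changed`."""

    def reaches(target, seen):
        if target in seen:
            return False
        seen.add(target)
        return any(
            dep == changed or reaches(dep, seen)
            for dep in deps.get(target, [])
        )

    return sorted(t for t in deps if reaches(t, set()))


# bytecomp/meta depends directly on the one submodule it uses.
FINE = {
    "bytecomp/meta": ["Stdlib.Obj"],
    "Stdlib.Obj": [],
    "Stdlib.List": [],
}

# bytecomp/meta depends on an archive that bundles every submodule.
COARSE = {
    "bytecomp/meta": ["stdlib_archive"],
    "stdlib_archive": ["Stdlib.Obj", "Stdlib.List"],
    "Stdlib.Obj": [],
    "Stdlib.List": [],
}

if __name__ == "__main__":
    print(needs_rebuild(FINE, "Stdlib.List"))
    print(needs_rebuild(COARSE, "Stdlib.List"))
```

In this model a change to Stdlib.List leaves bytecomp/meta untouched in the fine-grained graph, but invalidates it in the archive-level graph.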

Bear in mind that one of the goals here is to explore the design space. I don’t know if all the features I’ve enabled are optimal, but they work so we can explore the implications.

I don’t see any reason to trust that kind of assertion without any evidence. Converting a handful of tests is not really impressive. The old Makefile-based testsuite worked too, for some definition of “worked”.

Typically, there is an unbounded number of test types, so this approach cannot work for the generic case. What would be meaningful would be a proof that any ocamltest test can be converted to your settings and that this conversion results in tests as easy to write and extend as in the current settings. Lacking a proof, converting all tests would give me some moderate level of confidence in your affirmation.

Without those, an assertion like “we do not need ocamltest” is erroneous at best.

Off the top of my head, the missing-cmis tests and the dynlinking files are good examples of the moderately complex files that I had to write.

In other words, you make the archive a transparent implementation detail. That does sound interesting; thanks for the explanation!

Likely a dumb question: how are the dependencies in .bazel files updated?

It’s an excellent question. Everything in OCamlCC is hand-rolled. Well, the first cut was programmatically generated, but I’ve since done tons of hand-tuning as I explored various design options. Upstream deps changes are sometimes slightly annoying, but no more; for example, if a module is added the build tells you what’s missing and it’s pretty easy to figure out what to add. I occasionally use ocamldep to see if I’ve got any spurious deps. But obviously hand-editing is not so good for the general case.

I’ve got several tools in the works to address this and related issues (e.g. automatic conversion of dune files). On the bright side, since deps in the rules are transitive, each target only needs to list its direct deps, which are easy to obtain. Mapping module/sig deps to libraries is the hard part, which I gather is a Generally Recognized Problem.
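
As a sketch of why listing only direct deps is enough, the Python below derives the full, dependency-first list of inputs for a target from nothing but the direct deps of each target. The target names are invented for illustration, and this is a toy traversal, not how the Bazel rules are actually implemented.

```python
# Toy model: recover the transitive dependency list from direct deps only.
def transitive_deps(direct, target):
    """Return every target reachable from `target`, dependencies first."""
    ordered = []
    seen = set()

    def visit(node):
        for dep in direct.get(node, []):
            if dep not in seen:
                seen.add(dep)
                visit(dep)           # emit the dep's own deps first
                ordered.append(dep)  # then the dep itself

    visit(target)
    return ordered


# Each (hypothetical) target lists only what it uses directly.
DIRECT = {
    "main": ["parser", "lexer"],
    "parser": ["lexer", "ast"],
    "lexer": ["ast"],
    "ast": [],
}

if __name__ == "__main__":
    print(transitive_deps(DIRECT, "main"))
```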

Also note that since all inputs must be explicitly listed before the build starts, the complete dependency tree is already encoded in the BUILD.bazel files, so nothing like ocamldep runs as part of the build. Some people won’t like that, I suppose, but I think it’s the right approach. Deps don’t change much.

Then there’s programmatic editing of the BUILD.bazel files. It rather pains me to say I’ve been sitting on Sunlark for over a year. I think it has the makings of a pretty powerful editing tool for Bazel files, which would make updating the deps attributes easy (once you’ve discovered what they should be).

HTH,

Gregg

We already have dune and opam channels, so adding #buildsystem at this point will probably just add confusion. We can always play it by ear - if you see demand growing for an #obazl channel, just let me know and I’ll happily open one under the Ecosystem category.
