OBazl Toolsuite - tools for building OCaml with Bazel

Version 2 of OBazl, a Bazel ruleset for building OCaml code, will soon be available. I’m letting you know early because I’ll be giving a presentation about the OBazl Toolsuite at the Bazel Exchange conference next Wed, 22 June, at 3:00 pm UTC (10:00 am CDT). It’s a virtual conference so you can tune in from anywhere. The talk will focus on some of the quirks of the OCaml build discipline and how I addressed them in the OBazl ruleset.

The tools are usable now; they’re just not yet properly documented and packaged, and in a few places there’s a little more work to be done on the code. Nonetheless there is quite a bit of documentation (CAVEAT: some of it is outdated), with more on the way soon, and there are lots of demos available. So if you’re interested in using Bazel to build your OCaml code I welcome you to take a look:

The OBazl Book

Twitter handle is @obazldev
Discord: https://discord.gg/PHSAW5DUva

Cheers,

Gregg


I can see how a general-purpose build system with wide language support is attractive. We’re not shopping for a build system, since we’ve just migrated from OMake to dune, but it’s an interesting topic.

Does OBazl handle the -opaque flag, which makes it possible to avoid rebuilding a module’s dependents when its mli file did not change? For our particular OMake build this was a challenge, whereas dune handles it transparently.
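To make the question concrete, here is a minimal Python sketch of the rebuild decision. It assumes a simplified model of native compilation in which a client compiled against a dependency reads both its .cmi and its .cmx, while a dependency compiled with -opaque exposes only its .cmi. The function names are hypothetical and are not taken from OBazl, dune, or OMake.

```python
from __future__ import annotations


def inputs_read_by_client(dep: str, opaque: bool) -> set[str]:
    """Files of `dep` that a native-code client compilation reads (simplified model)."""
    files = {f"{dep}.cmi"}        # the compiled interface is always needed
    if not opaque:
        files.add(f"{dep}.cmx")   # cross-module optimization information
    return files


def client_needs_rebuild(dep: str, changed: set[str], opaque: bool) -> bool:
    """A client is rebuilt only if one of the files it reads has changed."""
    return bool(inputs_read_by_client(dep, opaque) & changed)


# Editing only the implementation of `foo` changes foo.cmx but not foo.cmi.
changed_files = {"foo.cmx"}
print(client_needs_rebuild("foo", changed_files, opaque=False))  # True
print(client_needs_rebuild("foo", changed_files, opaque=True))   # False
```

In this model the build tool's job is simply to leave the .cmx out of the client's declared inputs when -opaque is in effect.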

PS. The conference organizers have provided this discount token: BAZEL-GR-20

It should be good for 20% off, registration is at https://events.skillsmatter.com/bazelx2022

Yes, OBazl gives you pretty much complete control over that kind of stuff. I don’t have a test case where -opaque makes a difference, though; can you describe how you need to use it, or provide sample code? Is there a case where you want the build tool to infer that it is needed based on some criteria?

Regarding migration: in most cases it won’t be an either/or choice. OBazl and Dune can coexist, and before long I’ll have tooling to automatically convert dunefiles to bazel files. So I can imagine a situation where you write dunefiles for your OCaml code but use Bazel for all foreign code, and use OBazl to integrate. In other words, use Dune as a high-level language that you “transpile” to Starlark.


I took a closer look at -opaque and realized that I had not fully understood its implications. So I added support for it. Also wrote up some documentation for it at https://obazl.github.io/docs_obazl/rules-ocaml/user-guide/optimization . Feedback welcome.

Thanks,

Gregg


Oh wow, I’ve been wishing for this (and not having the time to learn Bazel well enough to do it) for a long, long, long, long time.

I’ll have to get it and try it!


I’m really excited to see this!
I’m hoping to use OCaml at Google one day; this is a big step towards making that feasible.


I wished I could have used OCaml at Google… Is it looking even remotely possible now?

Given that OMake 0.10.5 is incompatible with OCaml 5.0, I’m shopping for a new build tool for my Orsetto project. I use Bazel in my day job, so this is an attractive candidate. Any chance we might see a conf-bazel package added to OPAM so a package can depend on a compatible version of Bazel being installed on the host?

I’ve put a lot of work into seamless OPAM integration, but only in one direction: make it easy to use OPAM resources in a Bazel build program. I have not put much thought into integrating Bazel itself into the OPAM ecosystem. For example publishing a Bazel-enabled package to OPAM. It looks like writing such a conf-bazel package would be pretty easy, but I’m not sure it would do us much good at the moment. What specific use cases do you have in mind?

There are two ways to integrate Bazel and OPAM. One is to automatically generate BUILD.bazel files for OPAM packages. Then Bazel would build everything, eliminating the need for the OPAM engine. This is the strategy followed by Rust (tool: cargo_raze, evidently now supplanted by crate_universe) and Go (tool: gazelle). Unfortunately a complete solution along these lines is not feasible for OCaml, since source files do not carry enough information to support inference of a build program, and OPAM packages may use a variety of build languages (Dune, Makefiles, OMake, etc.). On the other hand, Dune seems to be the most widely used build engine by a considerable margin, and the Dune language is easy to parse (if not so easy to interpret), so I’m working on a conversion tool that automatically converts Dune files to BUILD.bazel files.
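The "easy to parse" half of that remark can be illustrated with a small Python sketch. It is not the conversion tool described here; it is a toy reader that assumes a dune file containing only atoms and parentheses (no quoted strings, comments, or variables), and the library name mylib is hypothetical. The hard part, deciding what each stanza means for the build, is deliberately left out.

```python
from __future__ import annotations


def tokenize(text: str) -> list[str]:
    """Split dune source into parentheses and atoms (strings and comments not handled)."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens: list[str]) -> list:
    """Parse a token list into nested Python lists, one per top-level stanza."""
    stack: list[list] = [[]]
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unexpected ')'")
            done = stack.pop()
            stack[-1].append(done)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("missing ')'")
    return stack[0]


def field(stanza: list, name: str) -> list:
    """Return the arguments of the first `(name ...)` field in a stanza."""
    for item in stanza[1:]:
        if isinstance(item, list) and item and item[0] == name:
            return item[1:]
    return []


DUNE = """
(library
 (name mylib)
 (libraries yojson lwt.unix))
"""

lib = parse(tokenize(DUNE))[0]
print(lib[0], field(lib, "name"), field(lib, "libraries"))
# prints: library ['mylib'] ['yojson', 'lwt.unix']
```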

The other strategy is to rely on OPAM to build dependencies and then “import” the built artifacts into Bazel. OBazl defines an opam_import rule for this purpose, and a tool that bazelizes OPAM switches, generating an OBazl ‘coswitch’. The mapping from OPAM package name to Bazel label is straightforward: ‘yojson’ to @yojson//lib/yojson, ‘lwt.unix’ to @lwt//lib/unix, etc.
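The two examples above suggest a simple rule, sketched here in Python. The function name is hypothetical, and the treatment of names with more than one dot (joining the remaining components with slashes) is an assumption that the examples do not confirm.

```python
def opam_label(findlib_name: str) -> str:
    """Map a package name such as 'lwt.unix' to a Bazel label (illustrative rule only)."""
    pkg, *sub = findlib_name.split(".")
    # A top-level package maps to //lib/<pkg>; a subpackage maps to //lib/<sub>.
    # ASSUMPTION: deeper nesting ('a.b.c') becomes //lib/b/c.
    path = "/".join(sub) if sub else pkg
    return f"@{pkg}//lib/{path}"


print(opam_label("yojson"))    # @yojson//lib/yojson
print(opam_label("lwt.unix"))  # @lwt//lib/unix
```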

So in practice OBazl supports a hybrid approach. Use Bazel to build your code, but import pre-built OPAM dependencies. To do that you run the opam conversion tool to generate a ‘coswitch’ which defines a local Bazel repo for each OPAM package, and configure your WORKSPACE.bazel to import those repos. Write your BUILD.bazel files using opam labels as above. If your project already uses dune, you can run the dune conversion tool to generate your BUILD.bazel files, which in some cases will need some tweaking, since some Dune stanzas lack sufficient information for conversion, and in others the conversion code needs complicated logic that I haven’t gotten around to writing, or that does not seem worth the bother.

The OPAM “import” conversion tool is fairly stable. It converts the META files in OPAM into BUILD.bazel files, which include dependency information. So when you depend on an opam_import target you get its entire dependency graph.
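As a rough illustration of where that dependency information comes from, the Python sketch below reads the requires field of a META file and walks it transitively. It is not the conversion tool itself: the package names and META contents are made up, and it assumes flat META files with a single unconditional requires line (real META files also have subpackages and predicate-qualified fields, which are ignored here).

```python
from __future__ import annotations

import re

# Hypothetical META contents, keyed by package name.
METAS = {
    "pkg_a": 'version = "1.0"\nrequires = "pkg_b pkg_c"\narchive(byte) = "pkg_a.cma"\n',
    "pkg_b": 'version = "0.3"\nrequires = "pkg_c"\n',
    "pkg_c": 'version = "2.1"\n',
}


def direct_requires(meta_text: str) -> list[str]:
    """Return the names listed in the first unconditional `requires = "..."` field."""
    m = re.search(r'^\s*requires\s*=\s*"([^"]*)"', meta_text, flags=re.MULTILINE)
    if m is None:
        return []
    # Names may be separated by whitespace and/or commas.
    return [d for d in re.split(r"[,\s]+", m.group(1)) if d]


def transitive_requires(pkg: str, metas: dict[str, str]) -> list[str]:
    """Collect every package reachable from `pkg`, each listed once, in discovery order."""
    seen: list[str] = []

    def visit(name: str) -> None:
        for dep in direct_requires(metas.get(name, "")):
            if dep not in seen:
                seen.append(dep)
                visit(dep)

    visit(pkg)
    return seen


print(direct_requires(METAS["pkg_a"]))      # ['pkg_b', 'pkg_c']
print(transitive_requires("pkg_a", METAS))  # ['pkg_b', 'pkg_c']
```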

The Dune migration tool is another matter. Reverse-engineering the Dune language is a non-trivial task, lemme tell ya. The good news is that after what seems like eons of work the end is in sight. I’ve been running it against a semi-random set of projects (js_of_ocaml, ocaml-protoc, some ppx libs, etc.) and working through the quirks inch-by-inch. Rule stanzas are a real PITA, can I just say that? In any case, it looks like I should have an alpha release with documentation and some case studies within a week or so. I hope. At the very least I’ll convert my dev configuration into something usable by others so you can follow along if you want.

Cheers,

Gregg


Apologies for not responding sooner. “Remotely possible”? Of course. Plausible? I begin to think so. The rules seem to work pretty well. I’ve been bazeling OPAM packages, and occasionally I’ve come across some scenario I had not foreseen, but in every case it’s been pretty easy to address the problem. In some cases what’s complicated in Dune becomes pretty simple in OBazl. Sometimes it’s a matter of just knowing how to use the resources Bazel makes available. For example, ocaml_module emits a .cmi file and a .cmo (or .cmx and .a) file. There is no way to ask for just the .cmi file or just the .cmo file, but some tools need this. But the rule also supports --output_groups=cmi on the command line, which can be used in a filegroup rule to extract and deliver the .cmi file. So you’d have your tool depend on a filegroup rule that depends on an ocaml_module rule. All of which could be encapsulated in a macro.
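The output-group idea can be modeled in a few lines of Python. This is a conceptual sketch, not Bazel or OBazl code: Target, ocaml_module, and filegroup below are hypothetical stand-ins that only mimic how a target's default outputs differ from a named output group, and only the "cmi" group mentioned above is modeled.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Target:
    name: str
    default_outputs: list[str]
    output_groups: dict[str, list[str]] = field(default_factory=dict)


def ocaml_module(name: str, stem: str) -> Target:
    """Model of a module target: delivers .cmi and .cmo together by default."""
    cmi, cmo = f"{stem}.cmi", f"{stem}.cmo"
    return Target(name, [cmi, cmo], {"cmi": [cmi]})


def filegroup(name: str, src: Target, output_group: str) -> Target:
    """Model of a filegroup that forwards one named output group of `src`."""
    if output_group not in src.output_groups:
        raise KeyError(f"{src.name} has no output group {output_group!r}")
    return Target(name, list(src.output_groups[output_group]))


mod = ocaml_module("Foo", "foo")
cmi_only = filegroup("foo_cmi", mod, "cmi")
print(mod.default_outputs)       # ['foo.cmi', 'foo.cmo']
print(cmi_only.default_outputs)  # ['foo.cmi']
```

A tool that needs only the interface would then depend on foo_cmi rather than on Foo.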

So in short, I think the major inhibitors now are tooling and documentation. Tools are on the way, along with a lot of examples demonstrating the stuff that will eventually be documented in a User Guide.

HTH,

Gregg


Before I answer this in full, I should emphasize here up front how very grateful I am that you’ve taken up this challenging task, and it would seem you’ve made a lot of progress already. You’ll see why this feels so important to me after I’ve explained my thinking further.

p1. I’m the author of Conjury, an alternative library for the OMake build tool.

That project started as a system of Perl libraries that I carted around in all my personal projects for generating cross-platform Makefile instances (because I absolutely could not stand GNU Autoconf, and I wasn’t allowed to use it in some contexts because GPL was forbidden without paying lawyers to get permission for it).

Some time later (but still well over a decade ago) I threw away all that maddening Perl and redesigned the whole system around OMake, which I much preferred. (This was well before JBuilder, which has begotten Dune. This was also well before Google decided to release Blaze into the world as Bazel.)

Since then, I have used Conjury in a whole raft of personal unreleased projects, while OMake has perhaps not been as well-loved by the community as I was hoping it would be when I adopted it. Recently, I’ve been finding it almost as maddening as that pile of Perl I threw away, and the friction is prompting me to search for an alternative.

I’m still not a big fan of Dune, because most of my personal projects are multi-language, with a lot of C++ and C and others in the mix, and the GNU Autoconf tool suite remains a non-starter for me for all the old reasons that remain unchanged. Bazel has a lot to offer me, it does basically everything that originally attracted me to OMake, and I’m trying to get my head around how I can rewrite all the build logic in all my personal projects to use it instead of Conjury.

p2. I’m in a position to influence choices of alternative programming language at my day job.

It’s a reasonably large organization with hundreds of engineers in the software department, and over a dozen people whose full-time job is just wrangling Bazel to build and integrate an insanely large and complex monorepo full of several major systems languages, along with a whole raft of special purpose programming languages including some bespoke DSLs.

With the impending release of OCaml 5.0, it’s just starting to be possible to speak seriously about it as a candidate for inclusion in our software ecosystem. It has some attractive qualities for our business case. Until recently the lack of a high quality Bazel rule set for OCaml has made demonstrating the applicability of OCaml internally to my colleagues pretty daunting. (One reason Rust hasn’t taken off at my day job is that the integration with Bazel is considered unacceptable by our dev infra team.)

Summary: The OBazl Toolsuite looks like it has a pretty good chance of sticking the landing on point two: helping me bring OCaml into my day job. It’s not there yet (documentation of the rule set is, understandably, lagging a bit behind the implementation), but I feel like it will probably get there in reasonable time. On the other hand, my search for a replacement for OMake remains an open problem for me. Only some of my personal projects are packaged with OPAM, and only a fraction of those are released publicly. (I have a private OPAM repository as well as a publicly visible one where I publish development branches for my packages on opam.ocaml.org.) I need to find a tool that can deal well with multiple programming languages, not just OCaml (so that leaves out Dune), and also produce artifacts that can be delivered as OPAM packages (even if they’re written in other languages, not OCaml). I was hoping that I might turn to Bazel for that, but it looks like a daunting challenge.


Doesn’t dune get advertised as being able to handle multiple programming languages, including C/C++? There seems to be a whole section on it in the docs: “Dealing with Foreign Libraries” (dune documentation).

Can you remember any of the blockers that prevent using it?

Thanks! Nice to get an attaboy after all these months (I started two(!) years ago).

What sticks in my craw wrt Dune is its opacity. That’s largely what motivated me in the first place. There’s a substantial mismatch between the conceptual structure of the Dune language and the actual build protocol of the OCaml toolset. “Virtual” modules? Lack of an actual build programming language is also a problem from my perspective.

On the bright side, progress has been steady if not evident to anybody but me, hehe. I suppose I should tweet status updates more often.

That is indeed one of my goals, but my first priorities are to make sure the rules are solid and to finish what I consider to be the essential tools needed to ease adoption and migration. Mainly tools that will ease the production and maintenance of BUILD.bazel files, which can look pretty verbose, especially compared to Dune stanzas. Which implies tools that automate OPAM integration. The good news is that that’s pretty close to done, or done enough. The tools work on the test cases I’ve written; I’m now in the process of finding and fixing bugs by running them on real projects.

I guess I should mention that the Dune conversion tool is written in C but exposes a Scheme API, which means users who know Scheme should be able to customize it or even write their own tools for processing dune files with relative ease.

Oh yeah, I’ve also got a tool, also Scheme-based, that supports scriptable editing of BUILD.bazel files. More details on all this to follow soonish.

Regarding OPAM packaging and deployment, I’m pretty confident that there’s a good solution to be found. Currently to integrate OPAM one runs $ bazel run @opam//shared:refresh, which writes the Bazel stuff to $XDG_DATA_HOME/obazl/opam (with symlinks to ~/.opam/). But I believe it could write those files directly into the opam switch without doing any harm. So I’m thinking that the tool that does the work could be adapted to run under OPAM, using the post-install-commands field of the .opam file. It would run after OPAM has finished the build and installation, generating bazel files in the opam switch. Once bazelization of OPAM packages is automated, running Bazel as a build tool under OPAM should work. Then deploying an OBazl project as an OPAM package would be a matter of generating an .opam file, which would probably not be too difficult. Writing a conf-bazel package looks fairly easy as well.

Thanks again for the very helpful feedback.

Gregg


There’s really no comparison. Dune evidently can use the (C ABI) outputs of a “foreign” build (if you write the glue code needed to make this work) but there’s no real build integration, and no hermeticity guarantees. Under Bazel different languages use different rulesets but they’re all Bazel rulesets, so you get one dependency graph across all languages, and if the rulesets are hermetic you get a hermetic build, without ABI restrictions. For example if your build needs to run a Python (or JavaScript, Ruby, whatever) tool, Bazel will build the tool and run it for you.

Even for C I think Bazel has much better integration. The rules in rules_cc (e.g. cc_library producing a .a file) deliver a CcInfo provider (a provider is a kind of struct whose fields contain the artifacts delivered by a build action). The rules in rules_ocaml (e.g. ocaml_module) understand CcInfo dependencies and pass them around using OcamlProvider (a provider specific to the ocaml rules). Bazel supports a merge operation for CcInfo, and the ocaml rules always merge their CcInfo deps and pass them on. So every build target delivers the merge of all its CcInfo deps. The ocaml_binary rule that links its dependencies into an executable merges its CcInfo deps (which include merged CcInfo from their deps, recursively) and ends up with a single CcInfo containing every cc dependency in the dep graph, in the right order, with no duplicates. Then it’s simply a matter of constructing the link command with the appropriate -ccopt options. More succinctly: you can add a C dep directly to the module that needs it, and Bazel will pass it up the dependency chain, ensuring that it ends up on the command line when needed - building archives or executables. You don’t need to add a C dep to an archive target when only one of n modules in the archive actually depends on it.
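The merge-and-propagate scheme can be sketched in Python. This is a conceptual model only: CcDeps stands in for a merged CcInfo, Module stands in for any ocaml rule, and the ordering used (a target's dependencies before its own C libraries) is an assumption chosen for illustration. None of these names or fields come from rules_cc or rules_ocaml.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class CcDeps:
    """Stand-in for a merged CcInfo: an ordered, duplicate-free tuple of libraries."""
    libs: tuple[str, ...] = ()


def merge(*infos: CcDeps) -> CcDeps:
    """Merge several CcDeps, keeping first-seen order and dropping duplicates."""
    seen: dict[str, None] = {}  # dicts preserve insertion order
    for info in infos:
        for lib in info.libs:
            seen.setdefault(lib, None)
    return CcDeps(tuple(seen))


@dataclass
class Module:
    name: str
    deps: list[Module] = field(default_factory=list)
    cc_deps: list[str] = field(default_factory=list)

    def cc_info(self) -> CcDeps:
        # Every target delivers the merge of its deps' C libraries plus its own.
        own = CcDeps(tuple(self.cc_deps))
        return merge(*(d.cc_info() for d in self.deps), own)


# Only module A needs libfoo.a directly; B and C depend on A.
a = Module("A", cc_deps=["libfoo.a"])
b = Module("B", deps=[a])
c = Module("C", deps=[a], cc_deps=["libbar.a"])
exe = Module("main", deps=[b, c])

print(exe.cc_info().libs)  # ('libfoo.a', 'libbar.a')
```

Although libfoo.a is reachable through both B and C, it appears once in the final list, which is the property the linking rule relies on.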

I’ve just started working on rules_jsoo, which I think will nicely demonstrate the virtues of Bazel integration. The Bazel ecosystem includes a bunch of tools for working with JavaScript; for example rules_js and rules_nodejs make it easy to control which node toolchain version to use, integrate npm stuff, etc. Wouldn’t it be nice to be able to use such tools directly, without writing a bunch of glue code? Now a key element of Bazel integration is the use of providers. Rules deliver providers, and since providers act as a kind of rudimentary type system, I can use the JsInfo provider (defined by rules_js) to integrate rules_jsoo with the larger Bazel js ecosystem. For example, the jsoo_library rule takes the OcamlProvider provider delivered by ocaml_module rules, which contains the .cmo file. So jsoo_library runs those .cmo files through the jsoo compiler and delivers the resulting js files in a JsInfo provider. That provider is suitable as input to the rules in rules_js, which gives us seamless integration. So we can use the js_binary rule of rules_js to run code produced by jsoo_library under node. All that’s needed is to list the latter as a dependency of the former. That’s the plan, anyway. Isn’t that nice?
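The "providers as a rudimentary type system" point can be modeled in Python. This sketch only mirrors the data flow described above; the field names (cmo_files, js_files), the functions, and the returned node command line are hypothetical stand-ins, not the actual rules_ocaml, rules_jsoo, or rules_js definitions.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class OcamlProvider:
    cmo_files: tuple[str, ...]


@dataclass(frozen=True)
class JsInfo:
    js_files: tuple[str, ...]


def ocaml_module(stem: str) -> OcamlProvider:
    """Model of a module rule: delivers its bytecode object in an OcamlProvider."""
    return OcamlProvider((f"{stem}.cmo",))


def jsoo_library(deps: list[OcamlProvider]) -> JsInfo:
    """Model of jsoo_library: accepts only OcamlProvider deps, delivers JsInfo."""
    for d in deps:
        if not isinstance(d, OcamlProvider):
            raise TypeError("jsoo_library deps must deliver OcamlProvider")
    # Stand-in for running each .cmo through the jsoo compiler.
    js = tuple(cmo[: -len(".cmo")] + ".js" for d in deps for cmo in d.cmo_files)
    return JsInfo(js)


def js_binary(deps: list[JsInfo]) -> list[str]:
    """Model of js_binary: accepts only JsInfo deps, returns a command line to run."""
    for d in deps:
        if not isinstance(d, JsInfo):
            raise TypeError("js_binary deps must deliver JsInfo")
    return ["node"] + [f for d in deps for f in d.js_files]


lib = jsoo_library([ocaml_module("hello")])
print(js_binary([lib]))  # ['node', 'hello.js']
```

Passing an OcamlProvider straight to js_binary raises a TypeError in this model, which is the sense in which the provider acts as a type.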

Cheers,

Gregg


Does it handle C++ packaging systems? Does it handle C++ testing frameworks? Static analysis and code coverage checks? I can just download rule sets for that in Bazel now. Sure, I gotta write all that myself in OMake, and I have written some of it, and it’s in Conjury 2.1. Last I looked, the support for C++ in Dune is mainly about using C++ libraries from OCaml projects, and not about being a first class tool for C++ projects. That’s why I’m not looking at Dune for my personal projects.


Thanks both for the detailed explanations. This confirms for me that people with even moderately complex project setups (polyglot setups) are not really benefitting from dune, while people with simple setups are paying the cost of dune’s complexity.


I see no confirmation for the second point. A simple project in pure OCaml will benefit tremendously from dune. I don’t know of any other tool that comes remotely close to dune for pure OCaml projects, be they small or large.


“I see no confirmation for the second point.”

I think I basically agree with that. Dune has some complexity to it for the simple pure OCaml project case that is forced upon it by some characteristics of the compiler tool chain that I consider to be longstanding design errors that are difficult to address. I’m thinking primarily of the gap that was originally filled by ocamlfind and which has since sort of metastasized into a carbuncle of unnecessary complexity that is difficult to excise, given that a proper simplification would probably require lifting library names directly into the syntax of the language so the compiler is aware of them.

I would characterize Dune as the best available tool for simple pure OCaml projects, but it should not be compared to Bazel. The idea of making Dune and OPAM into a replacement for what my day job is using Bazel to do is just not to be taken seriously. Dune is not sufficiently composable, not even really close to it. And OPAM isn’t suited for large-scale monorepo projects; it was not designed for that purpose.

The next time I start an OCaml project (which may be 29 Sept 2022) I will start using OBazl. And everything you write about Dune is true, and it’s why I am implacable in my opposition to it, using autoconf+make for all my projects instead.

Dune is opaque and OCaml-focused; I’ve written projects with OCaml + C++ + Python + Java + Golang. No way is Dune enough.