OBazl Toolsuite - tools for building OCaml with Bazel

I’ve put a lot of work into seamless OPAM integration, but only in one direction: make it easy to use OPAM resources in a Bazel build program. I have not put much thought into integrating Bazel itself into the OPAM ecosystem, for example by publishing a Bazel-enabled package to OPAM. It looks like writing such a conf-bazel package would be pretty easy, but I’m not sure it would do us much good at the moment. What specific use cases do you have in mind?

There are two ways to integrate Bazel and OPAM. One is to automatically generate BUILD.bazel files for OPAM packages. Then Bazel would build everything, eliminating the need for the OPAM engine. This is the strategy followed by Rust (tool: cargo_raze, evidently now supplanted by crate_universe) and Go (tool: gazelle). Unfortunately a complete solution along these lines is not feasible for OCaml, since source files do not carry enough information to infer a build program, and OPAM packages may use a variety of build languages (Dune, Makefiles, OMake, etc.). On the other hand, Dune seems to be the most widely used build engine by a considerable margin, and the Dune language is easy to parse (if not so easy to interpret), so I’m working on a conversion tool that automatically converts Dune files to BUILD.bazel files.

The other strategy is to rely on OPAM to build dependencies and then “import” the built artifacts into Bazel. OBazl defines an opam_import rule for this purpose, and a tool that bazelizes OPAM switches, generating an OBazl ‘coswitch’. The mapping from OPAM package name to Bazel label is straightforward: ‘yojson’ maps to @yojson//lib/yojson, ‘lwt.unix’ to @lwt//lib/unix, etc.
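To make the name-to-label mapping concrete, here is a minimal Python sketch. It is not OBazl code; the function name opam_to_label is hypothetical, and it only reproduces the two examples given above. How packages nested more than one level deep are mapped is an assumption.

```python
def opam_to_label(package: str) -> str:
    """Map an OPAM/findlib package name to a Bazel label.

    Follows the two examples in the post:
      'yojson'   -> '@yojson//lib/yojson'
      'lwt.unix' -> '@lwt//lib/unix'
    """
    root, *subpackages = package.split(".")
    if not root or any(not part for part in subpackages):
        raise ValueError(f"malformed package name: {package!r}")
    # A top-level package lives under //lib/<root>; a subpackage under //lib/<sub>.
    # ASSUMPTION: deeper nesting ('a.b.c') is joined with '/', giving '@a//lib/b/c'.
    path = "/".join(subpackages) if subpackages else root
    return f"@{root}//lib/{path}"


print(opam_to_label("yojson"))
print(opam_to_label("lwt.unix"))
```

The repository name comes from the first component of the package name, so every subpackage of one OPAM package ends up in the same Bazel repo.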

So in practice OBazl supports a hybrid approach. Use Bazel to build your code, but import pre-built OPAM dependencies. To do that you run the opam conversion tool to generate a ‘coswitch’ which defines a local Bazel repo for each OPAM package, and configure your WORKSPACE.bazel to import those repos. Write your BUILD.bazel files using opam labels as above. If your project already uses dune, you can run the dune conversion tool to generate your BUILD.bazel files, which in some cases will need some tweaking, since some Dune stanzas lack sufficient information for conversion, and in others the conversion code needs complicated logic that I haven’t gotten around to writing, or that does not seem worth the bother.

The OPAM “import” conversion tool is fairly stable. It converts the META files in OPAM into BUILD.bazel files, which include dependency information. So when you depend on an opam_import target you get its entire dependency graph.
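As a rough illustration of why depending on one imported target pulls in its whole dependency graph, here is a small Python sketch. It is not the conversion tool itself: the helper names and the toy META contents are invented for the example, and it only understands a plain top-level requires = "..." field, ignoring predicates such as requires(ppx_driver) and nested package blocks that real META files may contain.

```python
import re

# Hypothetical, heavily simplified META contents keyed by package name.
TOY_METAS = {
    "app": 'requires = "yojson, lwt"\n',
    "yojson": 'version = "0.0"\nrequires = "seq"\n',
    "lwt": 'requires = "seq"\n',
    "seq": 'requires = ""\n',
}


def parse_requires(meta_text: str) -> list[str]:
    """Return the package names listed in a plain top-level `requires` field."""
    match = re.search(r'^requires\s*=\s*"([^"]*)"', meta_text, flags=re.MULTILINE)
    if match is None:
        return []
    # Names may be separated by whitespace and/or commas.
    return [name for name in re.split(r"[\s,]+", match.group(1)) if name]


def transitive_deps(package: str, metas: dict[str, str]) -> list[str]:
    """Collect every package reachable from `package`, each listed once."""
    seen: list[str] = []

    def visit(name: str) -> None:
        for dep in parse_requires(metas.get(name, "")):
            if dep not in seen:
                seen.append(dep)
                visit(dep)

    visit(package)
    return seen


print(transitive_deps("app", TOY_METAS))
```

In the generated BUILD.bazel files the same information would be carried by ordinary Bazel dependency edges, so Bazel itself does the traversal.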

The Dune migration tool is another matter. Reverse-engineering the Dune language is a non-trivial task, lemme tell ya. The good news is that after what seems like eons of work the end is in sight. I’ve been running it against a semi-random set of projects (js_of_ocaml, ocaml-protoc, some ppx libs, etc.) and working through the quirks inch-by-inch. Rule stanzas are a real PITA, can I just say that? In any case, it looks like I should have an alpha release with documentation and some case studies within a week or so. I hope. At the very least I’ll convert my dev configuration into something usable by others so you can follow along if you want.




Apologies for not responding sooner. “Remotely possible”? Of course. Plausible? I begin to think so. The rules seem to work pretty well. I’ve been bazeling OPAM packages, and occasionally I’ve come across some scenario I had not foreseen, but in every case it’s been pretty easy to address the problem. In some cases what’s complicated in Dune becomes pretty simple in OBazl. Sometimes it’s a matter of just knowing how to use the resources Bazel makes available. For example, ocaml_module emits a .cmi file and a .cmo (or .cmx and .a) file. By default there is no way to ask for just the .cmi file or just the .cmo file, but some tools need this. However, the rule also supports --output_groups=cmi on the command line, and the same output group can be used in a filegroup rule to extract and deliver the .cmi file. So you’d have your tool depend on a filegroup rule that depends on an ocaml_module rule. All of which could be encapsulated in a macro.

So in short, I think the major inhibitors now are tooling and documentation. Tools are on the way, along with a lot of examples demonstrating the stuff that will eventually be documented in a User Guide.




Before I answer this in full, I should emphasize here up front how very grateful I am that you’ve taken up this challenging task, and it would seem you’ve made a lot of progress already. You’ll see why this feels so important to me after I’ve explained my thinking further.

p1. I’m the author of Conjury, an alternative library for the OMake build tool.

That project started as a system of Perl libraries that I carted around in all my personal projects for generating cross-platform Makefile instances (because I absolutely could not stand GNU Autoconf, and I wasn’t allowed to use it in some contexts because GPL was forbidden without paying lawyers to get permission for it).

Some time later (but still well over a decade ago) I threw away all that maddening Perl and redesigned the whole system around OMake, which I much preferred. (This was well before JBuilder, which has begotten Dune. This was also well before Google decided to release Blaze into the world as Bazel.)

Since then, I have used Conjury in a whole raft of personal unreleased projects, while OMake has perhaps not been as well-loved by the community as I was hoping it would be when I adopted it. Recently, I’ve been finding it almost as maddening as that pile of Perl I threw away, and the friction is prompting me to search for an alternative.

I’m still not a big fan of Dune, because most of my personal projects are multi-language, with a lot of C++ and C and others in the mix, and the GNU Autoconf tool suite remains a non-starter for me for all the old reasons that remain unchanged. Bazel has a lot to offer me, it does basically everything that originally attracted me to OMake, and I’m trying to get my head around how I can rewrite all the build logic in all my personal projects to use it instead of Conjury.

p2. I’m in a position to influence choices of alternative programming language at my day job.

It’s a reasonably large organization with hundreds of engineers in the software department, and over a dozen people whose full-time job is just wrangling Bazel to build and integrate an insanely large and complex monorepo full of several major systems languages, along with a whole raft of special purpose programming languages including some bespoke DSLs.

With the impending release of OCaml 5.0, it’s just starting to be possible to speak seriously about it as a candidate for inclusion in our software ecosystem. It has some attractive qualities for our business case. Until recently the lack of a high quality Bazel rule set for OCaml has made demonstrating the applicability of OCaml internally to my colleagues a pretty daunting challenge. (One reason Rust hasn’t taken off at my day job is that the integration with Bazel is considered unacceptable by our dev infra team.)

Summary: The OBazl Toolsuite looks like it has a pretty good chance of sticking the landing on point two: helping me bring OCaml into my day job. It’s not there yet (documentation of the rule set is, understandably, lagging a bit behind the implementation), but I feel like it will probably get there in reasonable time. On the other hand, my search for a replacement for OMake remains an open problem for me. Only some of my personal projects are packaged with OPAM, and only a fraction of those are released publicly. (I have a private OPAM repository as well as a publicly visible one where I publish development branches for my packages on opam.ocaml.org.) I need to find a tool that can deal well with multiple programming languages, not just OCaml (so that leaves out Dune), and also produce artifacts that can be delivered as OPAM packages (even if they’re written in other languages, not OCaml). I was hoping that I might turn to Bazel for that, but it looks like a daunting challenge.


Doesn’t dune get advertised as being able to handle multiple programming languages, including C/C++? There seems to be a whole section on it in the docs: Dealing with Foreign Libraries — dune documentation

Can you remember any of the blockers that prevent using it?

Thanks! Nice to get an attaboy after all these months (I started two(!) years ago).

What sticks in my craw wrt Dune is its opacity. That’s largely what motivated me in the first place. There’s a substantial mismatch between the conceptual structure of the Dune language and the actual build protocol of the OCaml toolset. “Virtual” modules? Lack of an actual build programming language is also a problem from my perspective.

On the bright side, progress has been steady if not evident to anybody but me, hehe. I suppose I should tweet status updates more often.

That is indeed one of my goals, but my first priorities are to make sure the rules are solid and to finish what I consider to be the essential tools needed to ease adoption and migration. Mainly tools that will ease the production and maintenance of BUILD.bazel files, which can look pretty verbose, especially compared to Dune stanzas. Which implies tools that automate OPAM integration. The good news is that that’s pretty close to done, or done enough. The tools work on the test cases I’ve written; I’m now in the process of finding and fixing bugs by running them on real projects.

I guess I should mention that the Dune conversion tool is written in C but exposes a Scheme API, which means users who know Scheme should be able to customize it or even write their own tools for processing dune files with relative ease.

Oh yeah, I’ve also got a tool, also Scheme-based, that supports scriptable editing of BUILD.bazel files. More details on all this to follow soonish.

Regarding OPAM packaging and deployment, I’m pretty confident that there’s a good solution to be found. Currently to integrate OPAM one runs $ bazel run @opam//shared:refresh, which writes the Bazel stuff to $XDG_DATA_HOME/obazl/opam (with symlinks to ~/.opam/). But I believe it could write those files directly into the opam switch without doing any harm. So I’m thinking that the tool that does the work could be adapted to run under OPAM, using the post-install-commands field of the .opam file. It would run after OPAM has finished the build and installation, generating bazel files in the opam switch. Once bazelization of OPAM packages is automated, running Bazel as a build tool under OPAM should work. Then deploying an OBazl project as an OPAM package would be a matter of generating an .opam file, which would probably not be too difficult. Writing a conf-bazel package looks fairly easy as well.

Thanks again for the very helpful feedback.



There’s really no comparison. Dune evidently can use the (C ABI) outputs of a “foreign” build (if you write the glue code needed to make this work) but there’s no real build integration, and no hermeticity guarantees. Under Bazel different languages use different rulesets but they’re all Bazel rulesets, so you get one dependency graph across all languages, and if the rulesets are hermetic you get a hermetic build. Without ABI restrictions. For example if your build needs to run a Python (or Javascript, Ruby, whatever) tool, Bazel will build the tool and run it for you.

Even for C I think Bazel has much better integration. The rules in rules_cc (e.g. cc_library producing a .a file) deliver a CcInfo provider (a provider is a kind of struct whose fields contain the artifacts delivered by a build action). The rules in rules_ocaml (e.g. ocaml_module) understand CcInfo dependencies and pass them around using OcamlProvider (a provider specific to the ocaml rules). Bazel supports a merge operation for CcInfo, and the ocaml rules always merge their CcInfo deps and pass them on. So every build target delivers the merge of all its CcInfo deps. The ocaml_binary rule that links its dependencies into an executable merges its CcInfo deps (which include merged CcInfo from their deps, recursively) and ends up with a single CcInfo containing every cc dependency in the dep graph, in the right order, with no duplicates. Then it’s simply a matter of constructing the link command with the appropriate --ccopt options. More succinctly: you can add a C dep directly to the module that needs it, and Bazel will pass it up the dependency chain, ensuring that it ends up on the command line when needed, i.e. when building archives or executables. You don’t need to add a C dep to an archive target when only one of n modules in the archive actually depends on it.
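The merge-and-propagate idea can be sketched in a few lines of Python. This is only a toy model, not the rules_ocaml implementation: the Target class, the merged_cc_deps function and the library names are hypothetical, and the ordering used here (dependencies first, first occurrence wins) is an assumption standing in for whatever order Bazel’s own CcInfo merge produces.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    name: str
    # C libraries this target needs directly (stand-in for its own CcInfo).
    cc_deps: list[str] = field(default_factory=list)
    # Other targets it depends on.
    deps: list["Target"] = field(default_factory=list)


def merged_cc_deps(target: Target) -> list[str]:
    """Merge a target's C deps with those of everything it depends on.

    Each library appears once, however many targets mention it.
    """
    merged: list[str] = []

    def visit(t: Target) -> None:
        for dep in t.deps:          # dependencies contribute first
            visit(dep)
        for lib in t.cc_deps:       # then the target's own C deps
            if lib not in merged:   # drop duplicates
                merged.append(lib)

    visit(target)
    return merged


# Only module `a` and module `c` declare C deps; the binary declares none.
a = Target("a", cc_deps=["libfoo.a"])
b = Target("b")
c = Target("c", cc_deps=["libfoo.a", "libbar.a"], deps=[a])
binary = Target("main", deps=[a, b, c])

print(merged_cc_deps(binary))
```

Note that the binary target never mentions a C library itself; everything it needs at link time arrives through its dependencies, which is the point made above about not having to repeat C deps on archive or executable targets.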

I’ve just started working on rules_jsoo, which I think will nicely demonstrate the virtues of Bazel integration. The Bazel ecosystem includes a bunch of tools for working with Javascript; for example rules_js and rules_nodejs make it easy to control which node toolchain version to use, integrate npm stuff, etc. Wouldn’t it be nice to be able to use such tools directly, without writing a bunch of glue code? Now a key element of Bazel integration is the use of providers. Rules deliver providers, and since providers act as a kind of rudimentary type system, I can use the JsInfo provider (defined by rules_js) to integrate rules_jsoo with the larger Bazel js ecosystem. For example, the jsoo_library rule takes the OcamlProvider provider delivered by ocaml_module rules, which contains the .cmo file. So jsoo_library runs those .cmo files through the jsoo compiler and delivers the resulting js files in a JsInfo provider. That provider is suitable as input to the rules in rules_js, which gives us seamless integration. So we can use the js_binary rule of rules_js to run code produced by jsoo_library under node. All that’s needed is to list the latter as a dependency of the former. That’s the plan, anyway. Isn’t that nice?




Does it handle C++ packaging systems? Does it handle C++ testing frameworks? Static analysis and code coverage checks? I can just download rule sets for that in Bazel now. Sure, I gotta write all that myself in OMake, and I have written some of it, and it’s in Conjury 2.1. Last I looked, the support for C++ in Dune is mainly about using C++ libraries from OCaml projects, and not about being a first class tool for C++ projects. That’s why I’m not looking at Dune for my personal projects.


Thanks both for the detailed explanations. This confirms for me that people with even moderately complex project setups (polyglot setups) are not really benefitting from dune, while people with simple setups are paying the cost of dune’s complexity.


I see no confirmation for the second point. A simple project in pure OCaml will benefit tremendously from dune. I don’t know of any other tool that comes remotely close to dune for pure OCaml projects, be they small or large.


I see no confirmation for the second point.

I think I basically agree with that. Dune has some complexity to it for the simple pure OCaml project case that is forced upon it by some characteristics of the compiler tool chain that I consider to be longstanding design errors that are difficult to address. I’m thinking primarily of the gap that was originally filled by ocamlfind and which has since sort of metastasized into a carbuncle of unnecessary complexity that is difficult to excise, given that a proper simplification would probably require lifting library names directly into the syntax of the language so the compiler is aware of them.

I would characterize Dune as the best available tool for simple pure OCaml projects, but it should not be compared to Bazel. The idea of making Dune and OPAM into a replacement for what my day job is using Bazel to do is just not to be taken seriously. Dune is not sufficiently composable, not even really close to it. And OPAM isn’t suited for large-scale monorepo projects; it was not designed for that purpose.

The next time I start an OCaml project (which may be 29 Sept 2022) I will start using OBazl. And everything you write about Dune is true, and it’s why I am implacable in my opposition to it, using autoconf+make for all my projects instead.

Dune is opaque and OCaml-focused; I’ve written projects with OCaml + C++ + Python + Java + Golang. No way is Dune enough.

It’s very simple to get the confirmation. Just watch a newcomer struggle to figure out how to use dune.


Wow, back when I was at Google, the major inhibitor was senior leadership. They made it very clear that the only general purpose languages allowed were Python, Java, and C++, and that we should not waste time advocating for more, because there would be huge costs associated with supporting each additional language. Then they added Go and made it clear that that was a rare exception and that we should still not be advocating for other languages. (at least for everyone below Rob Pike’s level)

Are you really talking about using OCaml at Google? Have they abandoned the idea that every engineer should be able to understand any code, or that every basic service should support every blessed language?

As opposed to what?
What’s the build system out there that’s so clear and simple that newcomers don’t get confused?

Bazel solves problems that are not even on Dune’s roadmap, sure, but would a newcomer be more at ease if the prerequisite for doing a simple OCaml project was an understanding of Bazel?

I’ve watched newcomers struggle to figure out how to use ant/maven/gradle/sbt (and that’s only because I’ve been doing Java professionally; I’m sure the list would have been longer if I had worked in more ecosystems).

I’m only doing Javascript once in a blue moon, and if the random magic string I’ve copied out of the internet doesn’t do the trick, I’m confused.

From my experience, build tools are complex and confusing and the only real remedy is better docs, start-up guide, books, blogs, etc.
I’m sure there are plenty of ways to make dune better or simpler, but if the goal is the newcomers not being confused, that’s a wild goose chase.

I was thinking more about the humble open source dev deciding which language to use, or which build system to use for OCaml.

Not me, I don’t have any connection to Google. Then again, if it’s good enough for Facebook…

No more than if the prerequisite was to understand Dune.

And tools that make it easy to get started, like a “new” command to generate the stuff you need to get started. Ideally somebody learning a new language should not need to spend any time (at first) dealing with a build language too. But that’s easily addressed with some tooling and simple instructions.

OTOH, dealing with the toolchain (including tools like debuggers, ocamlobjinfo, etc.) is an inescapable aspect of learning any language. For any but the most trivial project so is dealing with a build system. No point in writing code if you cannot compile and link it. Build systems often are “complex and confusing”, but that’s largely because the problem space itself is complex and confusing. There’s no getting around that.

A major difficulty with OCaml (in my view) is the complexity and opacity (under-documentation) of the build protocol itself, independent of any build system. Dune goes to considerable lengths to hide it, to the point of sometimes misrepresenting it (in my view; see “virtual libraries”). OBazl takes pains to expose it and make it understandable. I think minimizing opportunities for confusion here is a realistic goal.


This doesn’t only apply to learning. It also applies to prototyping, hypothesis generation and testing.

That’s the reason why I built brzo which I hope I’ll be able to release at some point (still needs a good design review and changes to the OCaml strategy since it assumed we were moving towards a model that didn’t happen in the end – namely the library linking proposal, I’d also like to add more languages to the mix but that could wait).

All of my projects start with brzoing these days, and the hassle-free build experience is exhilarating.

Note however that this is largely accidental complexity due to the fact that compilers work in idiosyncratic ways for what build systems need in order to do their incremental and parallelization business.

They are still stuck in a world where people would invoke their compiler manually at the cli level or specify the invocations themselves in a Makefile.

In fact, if it were not for the actual tools and the (lack of) information they give us, build would be an excessively simple problem.

More specific to OCaml, the compiler CLIs have an insane amount of quirks and the whole system greatly suffers from an underspecified linking model. Basically it was not a good idea to let that be defined by a third party tool, if only so that you can actually talk about libraries in error messages from the compiler.


I left Google over three years ago. (For entirely other reasons.) If I were still there, I would not try to advocate for OCaml adoption at Google.

I know it’s still unreleased but I briefly considered it anyway as an alternative to OMake for building my projects that go into OPAM packages. I couldn’t figure it out well enough, but I’m interested to see where it goes. I hope its composability and suitability for polyglot projects turns out better than with Dune. I now see that it was never designed to be suitable for building software distributions. Grmf.

I’m still very intrigued by OBazl because it promises to help make introducing OCaml 5.0 as an experimental language at my workplace a possibility to consider.

Are you mixing it up with b0, maybe?

As mentioned in the docs, brzo is totally unsuitable for building distributions: it works with heuristics and the build outcome depends on the state of your environment.

I wouldn’t use it but I’m very glad OBazl exists, when you are a small language it’s good to allow yourself to be easily used in larger systems.
