OCaml real pain points

Could you go into more detail? I’m really not sure what you mean.

1 Like

CMake also uses per-directory build files, and I think avoiding a pure single-file config is smart. With a single file you end up calcifying the directory structure inside the build file, making it extremely difficult to parse and modify.

The blog post linked above mentioned that Jane Street (JS) wanted to port their battle-tested changes to dune upstream. Was that cancelled? It’d be a real shame.

And yet lots of developers prefer to have an overview of their project in a single file.

Amusingly, the scattered build files of ocamlbuild were one of the most repeated complaints made to me when I was doing research on build systems, and it got replaced by a system that does exactly that :–)

In general I have been little impressed by the progress in usability of build systems. That’s not surprising because 1) programmers are generally poorly educated in the topic[1] and 2) these are hard problems. It’s easier to frame your success in terms of “industrial”-grade performance and scalability, which give engineers precise metrics they can brag about. But this is only really useful for the needs of a handful of large industrial users. When you work on a diversity of projects, performance is certainly useful but you can trade some of it for better usability (or have both!).


  1. If educated at all. Mind you, in my own curriculum twenty-five years ago, HCI was an optional topic that few students took. Not sure if that has changed now.

4 Likes

This is something I miss from F#. They write code like list |> List.map (fun x -> …) precisely because x gets its type inferred for member resolution in that order.

And this is the main thing preventing me from fully switching from F# to OCaml. Some domains can’t afford pauses during hot loops (e.g. games, real-time audio); and memory speed has fallen behind CPU speed, so the inability to control memory layouts leaves a lot of performance on the table.

I suggest checking the Zig creator’s talk on data-oriented design (even in the compiler) and noticing how much indirection and how many bloated cache lines we can’t easily get rid of in OCaml because of its box-heavy, word-aligned memory model. Not to mention the needless GC pressure.
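To make the layout point concrete, here is a minimal sketch contrasting an array of records with a struct-of-arrays arrangement. The type and function names (`particle`, `particles`, `of_aos`, and so on) are made up for illustration. It relies on two properties of the standard runtime: an array of records stores pointers to separately allocated blocks, while a `float array` stores its elements unboxed and contiguously.

```ocaml
(* Array-of-structs: each element of the array is a pointer to its own
   heap block, and [mass] is a boxed float because the record mixes
   field types. Hypothetical type, for illustration only. *)
type particle = { id : int; mass : float }

let total_mass_aos (ps : particle array) : float =
  Array.fold_left (fun acc p -> acc +. p.mass) 0.0 ps

(* Struct-of-arrays: one array per field. The float array is stored
   flat and unboxed, so a scan over [masses] touches contiguous memory. *)
type particles = { ids : int array; masses : float array }

let of_aos (ps : particle array) : particles =
  { ids = Array.map (fun p -> p.id) ps;
    masses = Array.map (fun p -> p.mass) ps }

let total_mass_soa (ps : particles) : float =
  Array.fold_left ( +. ) 0.0 ps.masses
```

The struct-of-arrays form is the manual workaround available today; it trades the convenience of one value per entity for a layout the programmer controls.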

Same. For an example of this not using a C-style preprocessor, the Ada language has something similar: Package bodies take all the definitions from their signatures for granted. However, exposed abstract types are fully defined in a private section of the signature instead, such that the types can be used unboxed.


I would add that the lack of upper bounds in opam packages has bitten me a couple of times. I know the opam maintainers don’t want them unless strictly necessary, but in that case OCaml should probably consider a solver algorithm designed for this workflow. Go’s version resolution algorithm, for example, simply picks, for each package, the highest lower bound found in the dependency tree (tracking direct-dependency checksums), and it turns out that’s enough to need neither a central index nor lock files.
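As a rough illustration of that selection rule, here is a simplified sketch in OCaml. It is not Go’s actual implementation: versions are plain integers, checksums are ignored, and `deps_of` is a hypothetical lookup that returns the requirements declared by one exact release.

```ocaml
module SMap = Map.Make (String)

(* A requirement: a package name and the minimum version needed.
   Versions are plain ints to keep the sketch short (an assumption). *)
type req = string * int

(* Walk every requirement reachable from the root and keep, for each
   package, the highest minimum version that anything asked for. *)
let select (deps_of : req -> req list) (root_reqs : req list) : int SMap.t =
  let rec visit selected seen = function
    | [] -> selected
    | ((name, v) as r) :: rest ->
      if List.mem r seen then visit selected seen rest
      else
        let selected =
          SMap.update name
            (function None -> Some v | Some v' -> Some (max v v'))
            selected
        in
        visit selected (r :: seen) (deps_of r @ rest)
  in
  visit SMap.empty [] root_reqs

(* Hypothetical dependency data: a.1 needs c >= 2, b.1 needs c >= 3. *)
let deps_of = function
  | ("a", 1) -> [ ("c", 2) ]
  | ("b", 1) -> [ ("c", 3) ]
  | _ -> []

(* c resolves to 3: the highest stated minimum, not the newest release. *)
let chosen = select deps_of [ ("a", 1); ("b", 1) ]
```

Because the result depends only on the declared minimums, the same inputs always produce the same selection, which is the property that makes a separate lock file unnecessary in this scheme.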

3 Likes

I agree. Especially with tools like uv showing what’s possible in the Python space for single-file scripts with self-contained dependency config, it would be nice if that kind of utility were available in OCaml (though I think creating an opam switch will always be heavier than a Python virtual environment, so perhaps the startup cost would be untenable).

1 Like

Not necessarily. The relocatable compiler work is in review right now, and once it is merged creating new switches by copying existing ones will just take a handful of seconds.

5 Likes

Where are performance gains coming from? Could build actions be further parallelised? Or analysis? Personally I find dune quite fast – what size are projects that need substantially more build performance?

Since I started this discussion, I should say: I listed “I don’t like using dune” as an “OCaml pain point”, but that doesn’t necessarily mean I think dune can be improved, because in many cases the thing I don’t like is a fundamental design choice. Of course I’m thankful the developers made something at all and maintain it, and they’re entitled to design it the way they like; it just so happens that it doesn’t suit my use well, and it feels reasonable to say so in the “pain points” thread. Not having the time or expertise to write an alternative build system from scratch, I’m at best vaguely hoping to encourage whoever might try their hand at it at some point. In the same way, I might say I don’t like that if-without-else statements don’t have terminators, or that Set.Make(Int).fold and List.fold_left have different argument orders, without expecting either to change any time soon.
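For readers who haven’t run into the two language quirks mentioned at the end, a small sketch (the names `IntSet`, `rev_of_list`, `rev_of_set`, and `counter` are only for illustration):

```ocaml
module IntSet = Set.Make (Int)

(* List.fold_left: the function takes the accumulator first, and the
   initial value comes before the container. *)
let rev_of_list = List.fold_left (fun acc x -> x :: acc) [] [ 1; 2; 3 ]

(* Set.fold: the function takes the element first, and the initial
   value comes after the container. *)
let rev_of_set =
  IntSet.fold (fun x acc -> x :: acc) (IntSet.of_list [ 1; 2; 3 ]) []

(* [if] without [else] has no terminator, so the [;] ends it: only the
   first assignment is guarded, the second one always runs. *)
let counter = ref 0

let () =
  if false then counter := 10;
  counter := !counter + 1
```

Swapping the argument orders between the two folds is a type error in most cases, but when the element and accumulator types coincide it compiles and silently does the wrong thing.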

To talk about the concrete issues, at least these things I don’t like seem to me to be non-negotiable parts of the design:

  • S-exps
  • The use of dune files in every directory (I plead guilty to not having written down what usability issue I hit when I tried using subdir)
  • The requirement for dune-project and opam files and having a notion of package and the public_name stuff (like this error), what I call the “bureaucratic” aspect of dune
  • Related, all the rules, aliases, renaming patterns (dune exec ./file.exe where ./file.exe is not a real file) that are built-in instead of being easily configurable or extensible
  • The fact you have to go through dune to call code which makes it hard to use for informal script-like things or multi-language projects

Here are things that I think ought to be improvable but in practice I’m not sure what one can do about it from the outside:

  • Warn-error. This comes up on here all the time. According to the discussion here, the developers don’t want to change it because they think they need a system to save all warnings first. For my personal use this is completely unneeded but I’m not even a professional user, I’m not in a position to convince anyone here.

  • The documentation. I don’t have any sort of expertise to write doc about dune, but also, my impression is that the developers are totally aware that the documentation is very lacking. See for instance this post (and the whole thread — btw the list of stanzas is still not sorted on the website, even though there is a PR that claims to have sorted it, I’m rather confused). In this post from 2023 it is mentioned that a rewrite effort is ongoing, as far as I can tell this has been very partial since the general aspect of the doc has not changed, especially the reference part.

    My issue with the documentation is the general lack of discoverability (due to the unstructured dump of esoterically-named options) and the lack of a precise, complete description of what this or that does, which often forces you to experiment to find out (here is a recent example of a question I had to experiment to answer). Again I think the developers know; the first issue, discoverability, is brought up many times in the thread I linked.

With all that said, as part of working on a bigger project right now I am compiling a list of usability issues I hit on the way (with dune or other things) that I will try to act on or at least post about when I am done.

5 Likes

For me, the biggest day-to-day struggle is the lack of rich refactoring tools that work across a large codebase. Even fairly mechanical refactorings like renaming a type, moving a function between modules, reorganizing a set of related types tend to require a lot of manual search-and-replace and compiler-driven cleanup.

I also agree with the pain of mutually recursive modules and the duplication between mli/ml files.
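A compact way to see both complaints in one file, using inline signatures to stand in for the .mli/.ml split (the modules `Shape`, `Even`, and `Odd` are made-up examples):

```ocaml
(* The signature plays the role of shape.mli, the structure that of
   shape.ml: an exposed type definition has to be written out twice. *)
module Shape : sig
  type t = Circle of float | Rect of float * float
  val area : t -> float
end = struct
  type t = Circle of float | Rect of float * float
  let area = function
    | Circle r -> Float.pi *. r *. r
    | Rect (w, h) -> w *. h
end

(* Mutually recursive modules need [module rec], and every module in
   the group must carry an explicit signature. *)
module rec Even : sig
  val is_even : int -> bool
end = struct
  let is_even n = n = 0 || Odd.is_odd (n - 1)
end

and Odd : sig
  val is_odd : int -> bool
end = struct
  let is_odd n = n <> 0 && Even.is_even (n - 1)
end
```

Any change to `Shape.t` has to be made in both places, which is exactly the kind of mechanical edit that better refactoring tools could take over.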

1 Like

Elaborating… I’m specifically complaining about the overly complex distinction between ocamlfind packages and opam packages, which was bolted onto the toolchain ecosystem in a kind of after-market enhancement, because the language and the compiler tools have no intrinsic support for identifying and distinguishing library frameworks.

I’ve long believed that the compiler toolchain itself needs to own the protocol for packaging libraries for distribution. It shouldn’t be delegated to 3rd-party tools like ocamlfind, dune and other build-related tools. I don’t want to be too prescriptive about how I think the tooling should work, but there are a lot of existing systems in other programming languages that could be informative in developing a design for it.

I should say that I am aware of efforts in the dune community to smooth over these rough edges, and my opinion is that dune is the wrong place to do it, because it’s overly focused on being a tool for OCaml-centric development rather than a general-purpose tool for polyglot development.

As a result, one of the major reasons I hesitate to mount any serious efforts at bringing OCaml into the environments where I do my day jobs is that OCaml’s interop story with polyglot build and packaging systems is so difficult to explain to people. Most of the pain there is related to the ocamlfind warts.

Another minor pain for me at my current day job, though, is the lack of decent support for private opam repositories hosted inside AWS CodeArtifact. I feel like that’s a lesser problem, which could be solved with less trouble, probably with an opam plugin. At least I hope so.

5 Likes

My experience over the last three of my jobs in Silicon Valley has taught me that you don’t really have to be a very large enterprise at all to start encountering the sorts of build performance problems that make EngFlow an attractive option.

It doesn’t matter what programming languages you are using in your technology stack. It’s easy to land in a place where you need something like EngFlow to keep your developer velocity from cratering. Lack of any kind of production-ready build tooling for OCaml that could be integrated into any of the various things like EngFlow was the basic non-starter that made considering OCaml as an option impossible.

2 Likes

I haven’t done any ppx programming in a long while so I’m not all that aware what devs need. From what I recall, one should be using ppx_expect or mdx to write such tests. Addressing your post more directly, I think a pform like %{ppx:foo+bar+baz} should make it easy to do what you want. I’ll see if I can add that.

The bottleneck here isn’t the lack of desire but the lack of people willing to do the work.

Once we allow the subdir inside dune-project files I think you’ll be able to do everything through a single file.

1 Like

I haven’t kept up with this, but wasn’t there a set of Bazel rules for OCaml that was fairly mature at this point? Have you given it a try?

1 Like

The part I’m having trouble with is that ppx_expect / mdx / CRAM usually operate on the outputs of a build, whereas ppx debug logging is generated during the build. In that other post I received a suggestion to handle this by calling dune itself from within an action, but I have yet to make that work.

Thanks, that would be very nice indeed.

Understandable. Thanks for all of your work!

Progress seems to have stalled out just before crossing the finish line. Documentation is still incomplete and transfer of sustaining engineering to the community organization seems unlikely at this point.

At my current day job we’re actively resisting the adoption of a monorepo model, so Bazel is less interesting to me now, but I think the team where I work now is on the losing side of that long term struggle.

I REALLY want something like this. Python was so painful for a long time, but uv is a game changer.

For reference:

Please take my words with a grain of salt as I’m not an insider. My understanding is that there was a strong will to upstream, but unfortunately it didn’t work out. Now Jane Street would be OK with open-sourcing their engine so that it could be used as inspiration, but we’d need hands to re-implement all the performance improvements, and it is unclear whether the energy to do so exists. It’s an unfortunate situation: everyone is willing to cooperate, but it hasn’t given the best returns (yet).

2 Likes

Looks like we might be getting it: Announcing the first release of Alice, a radical OCaml build system

4 Likes

It doesn’t look like it. Closest thing we have so far is probably b0caml.

2 Likes

This would be very welcome!

There is a proposed doc reorg pending here: “Rearrange docs to make the content more discoverable for newcomers” by sabine (Pull Request #12612, ocaml/dune, GitHub). If you have relevant opinions, it could be very helpful to have a more outside perspective.

AFAICT, the main issue here is that there is not a consensus about what the defaults should be. Some folks will end up unhappy with the defaults. But it is easy to configure alternative settings, which I’m guessing is why it is not so critical to get the defaults changed?
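For anyone landing here from a search, the usual way to override that default is an `env` stanza in the `dune` file at the project root. A minimal sketch, assuming the goal is only to stop warnings from failing the build in the default `dev` profile:

```dune
; Keep dune's standard flags, but make no warning fatal in the dev profile.
(env
 (dev
  (flags (:standard -warn-error -a))))
```

Warnings are still printed with this setting; they just no longer abort the build.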

Could you elaborate here? It is a main use case of dune to build binaries that can be used alone and put wherever you want, so there is no need to “go through dune”. And you can install stuff onto your system easily with opam. But I’m wondering what exactly you have in mind here. Is it something like the uv functionality discussed below?

To wit: Streamline the UX for newcomers · Issue #9006 · ocaml/dune · GitHub

1 Like