Portable External Dependencies for Dune Package Management

Dune lock directories record the names of any system packages needed to build projects or their dependencies. We’ve recently changed how this works in the interest of portability.

Background on depexts in Opam

A system package, or external dependency, or depext as I’ll refer to them from now on, is a non-Opam package which must be installed in order for some Opam package to be built or for code in a package to be executed at runtime. These packages must be installed by the system package manager, or by some other non-Opam means such as manually building and installing the package from source. Common types of depext are build tools such as the pkg-config command, often run to determine linker flags while building a package, or shared libraries such as libgtk, which an OCaml project might link against to create GUIs.

Opam usually installs depexts automatically. Opam knows how to invoke many different system package managers (such as apt or pacman), so when installing a package with depexts Opam can run the commands appropriate to the current system to install the required packages using the system’s package manager. For this to work, Opam needs to know the name of the package within the package repository appropriate to the current system, and these names can vary from system to system. For example, the pkg-config command is in a package named simply pkg-config in the apt package manager on Ubuntu/Debian systems, whereas in the third-party homebrew package manager on macOS it’s in a package named pkgconf. In order to determine the right package name for the current system, the package metadata for Opam packages with depexts contains a list of all the different known package names along with the conditions under which that name is correct. Here is that list for the conf-pkg-config Opam package:

depexts: [
  ["pkg-config"] {os-family = "debian" | os-family = "ubuntu"}
  ["pkgconf"] {os-distribution = "arch"}
  ["pkgconf-pkg-config"] {os-family = "fedora"}
  ["pkgconfig"] {os-distribution = "centos" & os-version <= "7"}
  ["pkgconf-pkg-config"] {os-distribution = "mageia"}
  ["pkgconfig"] {os-distribution = "rhel" & os-version <= "7"}
  ["pkgconfig"] {os-distribution = "ol" & os-version <= "7"}
  ["pkgconf"] {os-distribution = "alpine"}
  ["pkg-config"] {os-distribution = "nixos"}
  ["pkgconf"] {os = "macos" & os-distribution = "homebrew"}
  ["pkgconfig"] {os = "macos" & os-distribution = "macports"}
  ["pkgconf"] {os = "freebsd"}
  ["pkgconf-pkg-config"] {os-distribution = "rhel" & os-version >= "8"}
  ["pkgconf-pkg-config"] {os-distribution = "centos" & os-version >= "8"}
  ["pkgconf-pkg-config"] {os-distribution = "ol" & os-version >= "8"}
  ["system:pkgconf"] {os = "win32" & os-distribution = "cygwinports"}
  ["pkgconf"] {os-distribution = "cygwin"}
]
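
To make the selection step concrete, here is a minimal OCaml sketch of choosing names from such a list for one fully described system. The types, the function names, and the two example environments are assumptions made for illustration; they are not taken from Opam's source, and version comparisons are left out for brevity.

```ocaml
(* Hypothetical model of a depexts list; illustrative only. *)
type condition =
  | Eq of string * string          (* variable = "value" *)
  | And of condition * condition   (* c1 & c2 *)
  | Or of condition * condition    (* c1 | c2 *)

(* An environment maps variable names to values for one concrete system. *)
type env = (string * string) list

let rec holds (env : env) = function
  | Eq (var, value) -> List.assoc_opt var env = Some value
  | And (a, b) -> holds env a && holds env b
  | Or (a, b) -> holds env a || holds env b

(* A subset of the conf-pkg-config entries shown above. *)
let conf_pkg_config =
  [ ["pkg-config"], Or (Eq ("os-family", "debian"), Eq ("os-family", "ubuntu"))
  ; ["pkgconf"], Eq ("os-distribution", "arch")
  ; ["pkgconf"], And (Eq ("os", "macos"), Eq ("os-distribution", "homebrew"))
  ; ["pkgconfig"], And (Eq ("os", "macos"), Eq ("os-distribution", "macports"))
  ]

(* Collect the names from every entry whose condition holds. *)
let select env depexts =
  List.concat_map
    (fun (names, cond) -> if holds env cond then names else [])
    depexts

(* Assumed variable values for two example machines. *)
let ubuntu =
  [ "os", "linux"; "os-distribution", "ubuntu"; "os-family", "debian" ]
let mac_homebrew = [ "os", "macos"; "os-distribution", "homebrew" ]

let () =
  List.iter print_endline (select ubuntu conf_pkg_config);
  List.iter print_endline (select mac_homebrew conf_pkg_config)
```

With these assumed environments the same list yields pkg-config on the Ubuntu machine and pkgconf on the Mac, which is the system-to-system variation described above.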

depexts in Dune

Dune doesn’t install depexts automatically as the Dune developers are a little nervous about running commands that would modify the global system state. This may change at some point, but for now Dune only provides support for listing the names of depexts, leaving it up to the user to install them as they see fit.

The dune show depexts command can be used to list the depexts of a project. For that command to work the project must have a lock directory. Here’s an example of listing the depexts of a project:

$ dune pkg lock
...
$ dune show depexts
libao
libffi
pkgconf
sdl2

I ran these commands on a Mac with homebrew installed, so the package names are from the homebrew package repo. Each package listed there is one of the depexts of a package whose lockfile appears in the project’s lock directory. Let’s look at how this information is stored, using pkg-config as an example:

$ cat dune.lock/conf-pkg-config.pkg
(version 4)

(build
 (run pkgconf --version))

(depexts pkgconf)

The relevant part for us is the depexts field. The current released version of Dune only stores the package’s depexts for the system where dune pkg lock was run. The command dune show depexts simply concatenates the depexts fields from each lockfile in the lock directory.
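
As a rough sketch of that concatenation step, assuming each lockfile has already been parsed into a record: the record type, the placeholder package names example-a to example-c, and the choice to sort and de-duplicate the result are all assumptions for illustration, not a description of Dune's implementation.

```ocaml
(* Hypothetical in-memory view of a lock directory: each lockfile is
   reduced to its package name and the contents of its depexts field. *)
type lockfile = { package : string; depexts : string list }

(* Concatenate the depexts of every lockfile. Sorting and removing
   duplicates is an assumption made here to get tidy output. *)
let show_depexts (lock_dir : lockfile list) : string list =
  lock_dir
  |> List.concat_map (fun lockfile -> lockfile.depexts)
  |> List.sort_uniq String.compare

(* Only conf-pkg-config comes from the text above; the other package
   names are placeholders. *)
let lock_dir =
  [ { package = "conf-pkg-config"; depexts = [ "pkgconf" ] }
  ; { package = "example-a"; depexts = [ "sdl2"; "pkgconf" ] }
  ; { package = "example-b"; depexts = [ "libffi" ] }
  ; { package = "example-c"; depexts = [ "libao" ] }
  ; { package = "dune"; depexts = [] }
  ]

let () = List.iter print_endline (show_depexts lock_dir)
```

Because each depexts field here holds plain names for a single system, nothing in this step can adapt the output to the machine running the command.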

When thinking about portable lock directories I always like to imagine what the experience would be using Dune for a project where the lock directory is checked into version control. I frequently switch between using two different machines for development - one running Linux and the other running macOS. If I were to check in the lock directory I just generated on my Mac, and then check it out on Linux and continue development, dune show depexts would show me a list of packages for the wrong system!

Portable depexts in Dune

To make depexts portable, one’s first instinct might be to use the same approach as taken with the depends field outlined here, listing the depexts for each platform for which the solver was run. Indeed, such a change was added to the Dune Developer Preview when we first introduced portable lock directories. However, we quickly realized there was a problem.

The depends, build, and install fields of a package rarely vary between OS distributions. It’s reasonably common for those fields to be different on different OSes, but very rare for them to also be different on different OS distributions. As such, it’s expected that users will elect to solve their projects for each common OS, but there would be little value in solving projects for each OS distro. In fact, solving for multiple distros would slow down solving and bloat the lock directory, and users would somehow need to come up with a definitive list of distros to solve for.

But the depexts field is highly dependent on the OS distro, since package names are specific to the package repository for a particular distro. Recall that the depexts field in Opam package metadata lists package names along with the conditions under which that package name should be used, e.g.:

["pkg-config"] {os-family = "debian" | os-family = "ubuntu"}
["pkgconf"] {os-distribution = "arch"}
["pkgconf-pkg-config"] {os-family = "fedora"}
["pkgconfig"] {os-distribution = "centos" & os-version <= "7"}

These conditions almost always involve the name of the OS distro, and to make matters worse they also sometimes involve the OS version, as packages can change their names between different versions of the same OS. Evaluating these conditions at solve time for platforms with no distro or version specified tends to result in lockfiles with no depexts at all, since all the conditions evaluate to false.

The use case we have in mind for depexts in Dune is that a user will solve their project coarsely, usually just for each common OS with no consideration for distribution or version. Then when they run dune show depexts, the depexts will be listed using names appropriate to the current machine. For this to work, Dune needs to store enough metadata about depexts to compute system-specific depext names at a later time. In practice, that means storing the same names and conditions as are currently stored in Opam files, and deferring evaluation of the conditions until as late as possible, such as right when dune show depexts is run.

The latest version of the Dune Developer Preview does just this, translating the depexts field from each package’s Opam file into a Dune-friendly S-expression. After this change, the depexts field of conf-pkg-config’s lockfile is:

$ cat dune.lock/conf-pkg-config.4.pkg
...
(depexts
 ((pkg-config)
  (or_absorb_undefined_var
   (= %{os_family} debian)
   (= %{os_family} ubuntu)))
 ((pkgconf)
  (= %{os_distribution} arch))
 ((pkgconf-pkg-config)
  (= %{os_family} fedora))
 ((pkgconfig)
  (and_absorb_undefined_var
   (= %{os_distribution} centos)
   (<= %{os_version} 7)))
 ((pkgconf-pkg-config)
  (= %{os_distribution} mageia))
 ((pkgconfig)
  (and_absorb_undefined_var
   (= %{os_distribution} rhel)
   (<= %{os_version} 7)))
 ((pkgconfig)
  (and_absorb_undefined_var
   (= %{os_distribution} ol)
   (<= %{os_version} 7)))
 ((pkgconf)
  (= %{os_distribution} alpine))
 ((pkg-config)
  (= %{os_distribution} nixos))
 ((pkgconf)
  (and_absorb_undefined_var
   (= %{os} macos)
   (= %{os_distribution} homebrew)))
 ((pkgconfig)
  (and_absorb_undefined_var
   (= %{os} macos)
   (= %{os_distribution} macports)))
 ((pkgconf)
  (= %{os} freebsd))
 ((pkgconf-pkg-config)
  (and_absorb_undefined_var
   (= %{os_distribution} rhel)
   (>= %{os_version} 8)))
 ((pkgconf-pkg-config)
  (and_absorb_undefined_var
   (= %{os_distribution} centos)
   (>= %{os_version} 8)))
 ((pkgconf-pkg-config)
  (and_absorb_undefined_var
   (= %{os_distribution} ol)
   (>= %{os_version} 8)))
 ((system:pkgconf)
  (and_absorb_undefined_var
   (= %{os} win32)
   (= %{os_distribution} cygwinports)))
 ((pkgconf)
  (= %{os_distribution} cygwin)))

That’s a 1:1 translation of the depexts field from conf-pkg-config’s Opam file. There’s enough information there so that the appropriate package name can be computed on demand rather than just at solve time.
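
The following OCaml sketch shows one way such stored conditions could be evaluated on demand. It assumes, based on the operator names and on how Opam filters treat undefined variables, that and_absorb_undefined_var and or_absorb_undefined_var implement three-valued logic in which false absorbs an undefined operand of "and" and true absorbs an undefined operand of "or". The types, function names, and example environments are hypothetical and are not Dune's internals.

```ocaml
(* Illustrative types only; these are not Dune's internal definitions. *)
type condition =
  | Eq of string * string                 (* (= %{var} value) *)
  | And_absorb of condition * condition   (* and_absorb_undefined_var *)
  | Or_absorb of condition * condition    (* or_absorb_undefined_var *)

(* [None] stands for "undefined": a variable had no value in [env]. *)
let rec eval env = function
  | Eq (var, value) ->
    Option.map (String.equal value) (List.assoc_opt var env)
  | And_absorb (a, b) ->
    (match eval env a, eval env b with
     | Some false, _ | _, Some false -> Some false  (* false absorbs undefined *)
     | Some true, Some true -> Some true
     | _ -> None)
  | Or_absorb (a, b) ->
    (match eval env a, eval env b with
     | Some true, _ | _, Some true -> Some true     (* true absorbs undefined *)
     | Some false, Some false -> Some false
     | _ -> None)

(* A subset of the lockfile entries shown above. *)
let depexts =
  [ ["pkg-config"],
    Or_absorb (Eq ("os_family", "debian"), Eq ("os_family", "ubuntu"))
  ; ["pkgconf"], Eq ("os_distribution", "arch")
  ; ["pkgconf"],
    And_absorb (Eq ("os", "macos"), Eq ("os_distribution", "homebrew"))
  ]

(* Keep an entry only when its condition is definitely true. *)
let names_for env =
  List.concat_map
    (fun (names, cond) -> if eval env cond = Some true then names else [])
    depexts

(* A coarse solve-time platform versus two concrete machines (assumed values). *)
let solve_time = [ "os", "linux" ]
let ubuntu =
  [ "os", "linux"; "os_distribution", "ubuntu"; "os_family", "debian" ]
let mac = [ "os", "macos"; "os_distribution", "homebrew" ]

let () =
  Printf.printf "solve time: %d names\n" (List.length (names_for solve_time));
  List.iter print_endline (names_for ubuntu);
  List.iter print_endline (names_for mac)
```

Under these assumptions the coarse solve-time environment selects no names at all, while the same stored entries give pkg-config on the Ubuntu machine and pkgconf on the Mac once the distro variables are known. That is the motivation for deferring evaluation.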

This brings us a step closer to a world where Dune users can check their lock directories into version control with confidence that their builds are reproducible across different platforms. To try out the latest version of the Dune Developer Preview, go to preview.dune.build.

Nice work, Steve, and the rest of the team!

One non-standard depext backend that doesn’t modify the global system state is for Nix: Opam's Nix system dependency mechanism

From https://ocaml.org/changelog/2025-05-19-portable-lock-directories-for-dune-package-management:

I wonder if uv’s approach of a forking resolver could tractably solve for all platforms to create a portable lock file?

Hey Ryan, nice work with opam-nix! Looks like that will solve a problem that I currently manually work around (by creating shell.nix files for each of the ocaml projects I work on, urgh haha).

We have talked internally a bit about following uv’s approach to resolution since it solves the very problem that we have with opam packages being allowed to vary their dependencies in different environments. Our initial approach is deliberately naive so we could get it working quickly, and at some point we’ll hopefully get the opportunity to change the solving algorithm to be both faster and more general. Thanks for the link!

Thanks for the informative writeup, @gridbugs! Python standardised its lockfiles a few months ago (after considerable debate) in PEP 751, and there are two things of note:

Lock files can be single-use and multi-use. Single-use lock files are things like requirements.txt files, which serve a single use-case/purpose (hence why it isn’t uncommon for a project to have multiple requirements files, each for a different use-case). Multi-use lock files represent multiple use-cases within a single file, often expressed through extras and Dependency Groups.

…which is broadly similar in philosophy to what you propose above, although the details of how you encode them differ of course. They also note rejected ideas:

A previous version of this PEP recorded the dependency graph of packages instead of a set of packages to install. The idea was that by recording the dependency graph you not only got more information, but it provided more flexibility by supporting more features innately (e.g. platform-specific dependencies without explicitly propagating markers).
In the end, though, it was deemed to add complexity that wasn’t worth the cost (e.g. it impacted the ease of auditing for details which were not necessary for this PEP to reach its goals).

What I found interesting about the Python PEP approach is that (due to the scale of their ecosystem), they have many different implementations of the PEPs (e.g. Poetry, uv, pip-tools, PDM, etc), and this forces them to think about the metadata format as the specification as opposed to the tooling interface.

We seem to be deviating from this approach, with dune becoming the second source of metadata with these sexp lockfiles. We already have opam.lock files, where you can drop constraints to let the solver take over those bits (this is how we hand-edit lockfiles currently to be “more portable”), and then the proposal above for dune which is more automated and succinct, but different.

I think it’s important to avoid splitting our OCaml metadata ecosystem (as opposed to the tooling ecosystem, which works over metadata). Have you considered what a more unified format might be for the locking metadata? This is not to say that dune shouldn’t advance, of course, but consideration upfront about the implications of what we’re asking our users to check into a project repository is important, as those files then tend to linger for many, many years.

Not addressed at me, but I might be able to respond with something useful anyway. Early in the project, there were a few properties that I thought were desirable:

  • A lock file should be enough to determine the source and build instructions of every single package that was used to build the package universe.

  • It is possible to build the package universe described by the lock file with a reasonable amount of effort. Crucially, without invoking any complex solvers or other types of black boxes.

  • To make it convenient to distribute software to end users (who may not even be users of the OCaml language at all), it should be possible for end users to start building a package described by a lock file as soon as possible. In other words, by immediately downloading and building the package dependencies and not waiting for a package index to arrive first.

Needless to say, opam’s lock file format did not meet any of the above goals.

It might be interesting to know why I found these properties to be so important. Early in the project, I was learning about nix and flakes and was left very impressed. To cut to the chase, the first property is stolen directly from flakes. The second property is there to make it as easy as possible for a dune2nix-like tool to build a “switch” using nix’s omni-powerful caching and CI. The third property came from waiting on nixpkgs (opam-repository for nix) to download one too many times.

I was also very skeptical of an effective lock file requiring a solver at all to be useful. It greatly complicates the second point, and requires some sort of a package index to fill in the gaps left by the constraints. I’m not so confident about this now as we’ve managed to make the solver in dune much faster and better behaved (building on top of the great work by talex5 and the opam folks).

Thanks for writing up your goals. For what it is worth, I’d like to point out that opam switch export files (see opam/src/format/opamFile.mli at 2.3.0 · ocaml/opam · GitHub, or the “opam switch” manpage, look for import and export subcommands) fulfill the properties – especially those with --full and --freeze (options available since opam 2.1.0):

  • the file includes the build instructions, and source links, etc. for every used opam file;
  • you can use opam switch import <file> and won’t use the solver;
  • you can distribute these files, users/clients don’t need any opam repository to be present.

Since this is a dump of your opam installed switch, you may not get the platform-portability that this topic is about. But I think the file format and tooling is sufficiently ready and can be used as an interchange file format – so maybe “dune package management lock files” could be converted to these switch-export files.

In case you’re curious, here’s one of these switch export files. FWIW, we have used them for years for our reproducible build infrastructure, and for rebuilding packages (see https://builds.robur.coop in case you’re curious). Here, both --full (to embed potential extra-files/patches (no longer present in the main opam-repository)), and --freeze (to point to a specific git commit in case you’ve git-pinned packages) are used (and are necessary for reproducing the build).

I understand the term reproducible can be defined in various ways, I stick to the definition of https://reproducible-builds.org/ (bit-wise identical output). We developed orb to conduct these builds and rebuilds.

By no means do I want to argue for or against something; I only want to emphasize that the interchange file format @avsm mentioned may have already existed for a couple of years, and it may be worth using it more.

Right, I forgot that those exist. I do not think those are what Anil had in mind though. I think our format still has a few advantages over this format, but there’s no denying that this format is workable as well. It could be interesting to add first class support for building such switch exports in dune.

The opam switch export format is just fine by me as well. My initial delight at having an independent implementation of opam file parsing in dune has been tempered by seeing the file formats diverge :slight_smile: Ideally, there will be co-evolution between both projects. For instance, opam switch exports also have enough information to reconstruct an opam-repository-format directory structure, but there’s no tooling for that at the moment in opam.

I also wonder how much feature creep is delaying the release of dune package management, which has been in “preview mode” for over a year now. I’ve never had a need for portable lockfiles in the OCaml software I’ve shipped, since there is always a specific combination of arch/OS binaries that are tested. We just maintain lockfiles for each of those combinations, and if a consumer of the repo is on an untested path, the absence of a lockfile is a useful signal that the software itself might not be that well tested on that specific platform. @ryang has been leading some research here on universal package solvers, and it’s primarily highlighted that there are many subtle differences in how different package managers handle external dependencies.

I’d prefer to have a first dune release that matches current opam functionality, but crucially with a versioned metadata file in the style of everything else in dune, so that error messages and the exact version of the dune binary being used don’t matter as well. We do that for everything else in dune, and it’s somewhat frustrating to get sexp parsing errors while juggling dune binaries, environment variables and compilation options just to try out package management. Once you have a versioned file, then translating that to/from the opam file formats seems achievable.

I think the package management feature itself is great… but shipping it is crucial to getting real feedback!

I don’t think portable lock files are essential either. It’s true that package management in dune still has some limitations and bugs, but it’s certainly good enough for a meaningful chunk of people. Nor will lifting these limitations and fixing the bugs significantly change the workflow for existing users.

I’ll start the discussion for stabilizing the lock directory format.

Just wanted to share a cross reference on these two points:

IIUC, these are addressed pretty thoroughly in this reply from earlier in the week: Migrating OCaml.org to Use Dune (Developer Preview) - #7 by maiste

But it also seems clear that the framing of the current binary and its configuration as a “preview” is causing some confusion. We are interested in figuring out ways to improve the clarity there.

Good to see this topic has generated some interesting discussion while I’ve been on holiday!

If we do end up stabilizing Dune package management without portable lockdirs and portable depexts (which I don’t personally recommend), we’ll need to be super clear in messaging/docs that Dune’s lockdirs are not portable, and about the circumstances (if any) under which users are recommended to check their lockdirs into version control or add them to .gitignore. Dune will be the only language package manager I’m aware of that generates non-portable lockfiles for projects, so it’s likely that users (especially new users trying Dune for the first time) are going to assume the lock directories are portable and check them in, possibly leading to surprising behaviour.

Wouldn’t dune just ignore a lock dir if it saw it was for a different OS or architecture? Maybe print a notice that it’s recalculating the locks? To me it makes sense to have all the lock dirs live side by side.