Dune lockfiles are not cross-platform. How should we approach checking them into version control?

Dune can create a “lock directory” containing a package dependency solution for a project. The command to generate the lock directory is dune pkg lock, and the dependency solution will be specific to the machine (OS/arch/distro/etc) where that command was run; if you run dune pkg lock on a Linux machine, the resulting lock directory will not work on a Mac. The lock directory is named “dune.lock” by default. The directory contains one text file for each package in the project’s dependency closure, with copies of some of the fields from the package’s opam file (converted into dune sexps). If a project has a lock directory, it can be built along with all its dependencies without an installation of opam present.
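For concreteness, here is a rough sketch of what such a lock directory might contain. The layout and field names below are illustrative only, not exact dune output:

```
dune.lock/
  lock.dune     ; lockdir metadata
  foo.pkg       ; one file per package in the dependency closure
  bar.pkg

; foo.pkg, roughly:
(version 1.0.0)
(source
 (fetch
  (url https://example.com/foo-1.0.0.tbz)
  (checksum sha256=...)))
```

The point is that each package file records a concrete version, source, and build recipe chosen by the solver for the machine it ran on.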

The context for this post is that the dune developers are considering what work needs to be done before we can start promoting the widespread use of dune package management as an alternative to opam.

Before encouraging the widespread use of dune package management, I’d like to avoid the situation where one developer generates a lock directory for a project and checks it into version control, and a second developer then finds they are unable to build the project because they’re using a different type of computer from the first developer, and the project’s lock directory is specific to the first developer’s computer.

I claim that naming the lock directory “dune.lock” by default will encourage developers to check it in, as that is the convention in other ecosystems where lockfiles are similarly named (at least for applications). The solution proposed in this issue is to change the default name to “.dune.lock” which will hopefully at least give developers pause before checking in their lockdirs. The issue proposes a workflow similar to local opam switches, where dune is used to install a project’s dependencies into a per-project sandbox.

There is an argument for checking in lockdirs despite them being specific to one type of computer, which is that it allows people using identical computers to one another to guarantee reproducible builds. This might be useful in corporate settings where the company can guarantee that all developers will be using identical machines, or for solo devs who only have one computer.

Dune does support a project having multiple different lockdirs, specific to different OS/arch/distro/version/etc combinations, but I don’t think we should encourage the use of this feature as there could be potentially many lockdirs for a project and they could drift out of sync.
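As an illustration, per-platform lockdirs can be declared in the dune-workspace file roughly like this (field names are approximate; consult the dune documentation for the exact form):

```
(lock_dir
 (path dune.linux.lock)
 (solver_env
  (os linux)))
(lock_dir
 (path dune.macos.lock)
 (solver_env
  (os macos)))
```

Each `lock_dir` stanza names a separate lock directory and the solver variables it was solved for, and dune selects the matching one at build time.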

Dune also has a feature where the specific revision of the opam repository that was used to solve a project’s dependencies can be recorded in the dune-workspace file. This alone acts somewhat like a lockfile, and can be safely checked into version control and used on different machines. In this case, dune running on two different machines with the same OS, architecture, etc. will produce the same package solution. Solving a project’s dependencies can take tens of seconds in the worst case, and the lockdir is effectively a cache of the result of this process.
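As a sketch, pinning the repository revision in dune-workspace looks roughly like this (the exact stanza and field names may differ between dune versions; the commit hash is a placeholder):

```
(lang dune 3.16)
(repository
 (name pinned-upstream)
 (url "git+https://github.com/ocaml/opam-repository.git#<commit-hash>"))
(lock_dir
 (repositories pinned-upstream))
```

With this checked in, every machine solves against the same snapshot of the repository, so machines with identical solver variables reach identical solutions.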

I’m interested to hear what other ocaml users think about dune lock directories. If dune created a lock directory for your project named “dune.lock” would you be likely to assume it to be a general-purpose lock directory and check it into the project? Would naming it “.dune.lock” instead make you less likely to check it in? Given that lockdirs are not cross-platform, under what circumstances would you still check one into version control anyway?

3 Likes

Here are my expectations. If I run a command dune pkg lock, I expect it to output a single file, maybe dune-project.lock, that just contains a listing of all the resolved versions of dependency packages. I then check this one file into my repo and anyone who uses it will then get those exact resolved versions, and it will work cross-platform.

A really great workflow to look at here is Go’s package management workflow. It works very similarly to what I described, with the equivalent files being dune-project → go.mod and dune-project.lock (proposed) → go.sum. I believe we can take away some great lessons from the DX of this locking system.
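For reference, Go splits this into a manifest of resolved versions and a checksum database, and both are portable across platforms. A minimal pair looks like this (checksums elided):

```
// go.mod — module requirements with resolved versions
module example.com/app

go 1.22

require github.com/pkg/errors v0.9.1

// go.sum — checksums for every module version in the build
github.com/pkg/errors v0.9.1 h1:...
github.com/pkg/errors v0.9.1/go.mod h1:...
```

Because go.sum records content hashes rather than platform-specific build plans, committing it is always safe, which is the property being asked for here.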

4 Likes

I am not a heavy user of package managers, but my feeling is that given the highly heterogeneous computing landscape nowadays, committing lockfiles which are not cross-platform is pretty much useless in most cases. On Windows, for one, it is exceedingly common to switch between native Windows and WSL. Even on Linux, lockfiles would not be portable among different distributions (if I understood correctly), and so on.

Accordingly, I think lockfiles should not be exposed in a way that makes them “committable” until they can be made platform-agnostic. Committing the opam-repository hash seems a better alternative in the meantime.

+1

Cheers,
Nicolas

3 Likes

I don’t know if the word “should” should be taken literally here. Users are a diverse bunch and their portability needs differ widely. The average user (like myself, for example) does not pretend to support every combination of operating systems, distributions, external package managers, and architectures that opam-repository does. Even opam-repository, despite its best efforts, has well-known limitations here.

So no, it doesn’t make sense to make a feature unavailable to users until it meets an arbitrary portability criterion.

The main problem with that approach is that it is useless for distributing software. Downloading the opam repository and running the solver are an expensive additional step. Skipping them is a rather important need for delivering said software to end users, running things fast in CI, etc. Moreover, many people complain about performance with opam, and I think these issues are part of the problem here.

Why would they go out of sync? There’s no issue with generating them all at once and they can be generated on any machine.

I think the real issue is that they would become hardly readable in their current form.

That’s bad, but what’s even worse is having to download this repository before you even do anything.

All platform lock-files generated on any (single) machine? That is a pleasant surprise. I assumed like @gridbugs that a platform-specific lock file could only be generated on that platform.

If the capability exists to generate all platform lock-files on any single machine, then we can serialize the map of platform lock-files into dune.lock and call it a day.

Readability is a laudable goal but at least for me it is unnecessary in the first versions.

I don’t understand. If a specific package supports a set of operating systems–let’s say Windows, macOS, and Linux–then why would its lockfile only work on the OS it was generated on? What makes the lockfile OS-specific? I’ve never heard of package managers generating an OS-specific lockfile before.

1 Like

Lock file generation is entirely deterministic (I hope) and only refers to input from opam repositories and pins. So patience is the main limitation here.

I just checked how Cargo serializes lock files for multiple platforms, and it simply includes all versions of conditionally selected packages for all “features”, with some deduplication.

We could easily do the same, with the caveat that one would need to list the supported os/distro/arch combinations explicitly. Opam allows for more flexibility here than Cargo: “mutually exclusive features” (and more) are supported. As a consequence, it’s much harder to know which platform-specific variables a build plan is sensitive to.

Readability is a laudable goal but at least for me it is unnecessary in the first versions.

Well, we’ve kind of achieved it already for “non portable” build plans. Lock directories are very diff-friendly and specifically contain a file per package. It is incredibly useful to compare what changed between invocations of the solver this way. This is something that many similar tools fail badly at (flakes, cargo, etc.).

What lock file would you expect to be generated if you have the following package?

depends: [
  "foo" {os = "linux" & >= 1.0.0}
  "foo" {os = "macos" & < 1.0.0}
]

On Linux,

depends: [
  "foo" {os = "linux" & >= 1.0.0 & src = "https://.../foo-1.0.0.tbz" & checksum = "sha512=..."}
  "foo" {os = "macos" & < 1.0.0}
]

On macOS,

depends: [
  "foo" {os = "linux" & >= 1.0.0}
  "foo" {os = "macos" & < 1.0.0 & src = "https://.../foo-0.9.9.tbz" & checksum = "sha512=..."}
]

I.e., generate exact locked versions on the OS we are on, but keep the lockfile in a working state if someone tries to use it on another OS. Then let the user on the new OS know that the lockfile has been updated with OS-specific metadata.

You’re still only fixing the versions for a single platform. Therefore by any reasonable definition, your lock file isn’t portable.

We also have the property that a build plan in one configuration can be executed in another - even if it isn’t guaranteed to work. I don’t know if this matches what you mean by a “working state”.

Didn’t really understand this one. If the lock file is being updated, then it’s not really being used on the new platform? How is this better than just generating it from scratch on the platform?

Right, but it’s a best-effort attempt when the package has mutually exclusive dependencies on different OSs. I think we both agree that if dune is running on a specific OS, it can’t be reasonably expected to generate lock data for the dependencies of another OS? And in the case of dependencies which are the same across OSs, it works portably.

If the lock file is being updated, then it’s not really being used on the new platform?

Yes, but since using the lock data of another platform is not a reasonable expectation, updating it with the lock data of the new platform seems like a good compromise, if we inform the user that this happened (and anyway, they can easily see it in the diff).

How is this better than just generating it from scratch on the platform?

This way we can have a single file with all lock data needed for every platform; platform-independent lock data remains the same and is shared; and platform-specific lock data are all viewable in one place.
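To make the idea concrete, a merged single-file format along these lines might look like the following. This is purely hypothetical, not an existing dune format:

```
(package
 (name foo)
 (common                      ; platform-independent data, shared
  (dev-deps ...))
 (platform (os linux)
  (version 1.0.0)
  (source (url https://.../foo-1.0.0.tbz) (checksum sha512=...)))
 (platform (os macos)
  (version 0.9.9)
  (source (url https://.../foo-0.9.9.tbz) (checksum sha512=...))))
```

Each supported platform contributes one section, and a machine reading the file picks the section matching its own solver variables.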

1 Like

Not only do we disagree, but in fact dune can already do what you say it can’t, and generate lock data for any OS on any OS. You just need to tell dune which exact configurations you need the build plan for. Dune doesn’t care about producing build plans for any imaginable config; it just can’t help you tell whether they are useful (by executing them).

This happy case also works for us. The problem is that many fundamental dependencies (like the compiler) do have platform specific deps - at least as expressed in the opam repo. So we don’t end up encountering the happy case in practice very often.

Sure, that sounds a lot more reasonable. The question is now whether this should be committed? I imagine the maintainers of this project might not care about your specific configuration. They might not have the means to test it for example.

I think you’re going to find that the list to cover “every platform” is going to be too large to handle in practice. The opam repository supports many combinations of architecture, operating system, distribution, and OS family.

1 Like

Great! Then what prevents it from putting them all in the same lockfile?

we don’t end up encountering the happy case in practice very often.

OK, I see. So you are saying that every package would end up with a large wall of platform-specific deps data, like the data shown here? opam-repository/packages/ocaml-base-compiler/ocaml-base-compiler.5.2.0/opam at master · ocaml/opam-repository · GitHub

The question is now whether this should be committed?

If we are talking about a project maintained by someone else and they don’t want to (eg) support macOS dependency lock data, then they can of course reject my PR to update the lock data, and I would have to maintain it myself. If it’s a project that I control and I/my team want to support both Linux and macOS, then we will commit the lock data for both the OSs.

I think you’re going to find that the list to cover “every platform” is going to be too large

Sorry, maybe bad wording on my part? I don’t mean ‘every platform’ as in ‘every conceivable platform that exists’, I just mean ‘every platform that I decided to support for this project’.

1 Like

Thanks for the comments @rgrinberg!

Just to make sure I understand: if one uses a preexisting compiler in the environment (instead of building it with Dune – is this supported by the way?), and only pure OCaml dependencies (no C bindings), does this situation fall in the “happy” case?

Cheers,
Nicolas

This might help, but not as much as you think. The formula for the system compiler still evaluates a lot of platform specific variables: opam-repository/packages/ocaml-system/ocaml-system.5.2.0/opam at master · ocaml/opam-repository · GitHub

I think excluding the compiler altogether is the only reasonable approach if you want to hit this happy path.

Nothing. You can already just have multiple lock directories and select which one you want on different platforms. I don’t think having them all in the same lockfile is materially different.

That would be the bare minimum - yes.

I think that’s a lot more reasonable and matches quite closely what I would expect users to do. We already have all the building blocks to do this; we just need to bless some defaults to make it smoother.

1 Like

I was including the time it takes to download the repository in my time estimate here. You only pay that cost the first time you try to lock a project on a machine. Running the solver is usually less than 10s and there’s low-hanging fruit to optimize it further. Also, the time it takes to build real projects along with all their dependencies (I’ve been using ocamlformat as a benchmark lately) is on the order of minutes, so the cost of solving doesn’t dominate the time between cloning and running the project. Of course, the dune cache will speed up builds, but not the first time a user builds something while the cache is cold; similarly, after the first time a user runs the solver (and downloads the opam repo), subsequent solves will be much faster.

I want to explore this idea a little more. I hesitate to call this useless for distributing software. Running the solver is deterministic - it’s a referentially-transparent function of the opam-repository revision and the platform-specific solver variables (os, arch, distro, etc). Currently lockfiles are only useful in situations where all developers are using very similar computers (i.e. the solver variables are the same on all computers), and in this case, running the solver on each machine (with the opam-repository revision checked in) will produce an identical lockfile on each machine - checking it in in this case is purely an optimization. Running the solver on a different type of computer will of course produce a different lockfile, but I claim this is preferable, as the default behaviour of Dune, to a user attempting to use a checked-in lockfile on an incompatible computer.

Because the solver is deterministic, if the inputs to the solver are checked-in (the opam-repository revision), checking in the outputs of the solver (the lockdir) should be purely seen as an optimization to avoid the need to download the opam repo and run the solver. Currently checking in a lockdir only acts as an optimization on machines compatible with the lockdir, and requires extra configuration to prevent errors on machines that are not compatible with the lockdir. New users will not realize lockdirs are platform-specific if they are named dune.lock as other ecosystems name their platform-agnostic lockfiles in a similar way.

I think we should aim to make lockdirs act as an optimization with no side effects (on any machine) other than speeding up builds/solves. Then it will be safe to check them in as the presence of a lockdir won’t make a project inaccessible to users of certain types of computer. Until that point, I think we should encourage users to achieve reproducible builds by checking in the solver inputs (the opam-repository revision) instead of the lockdir. Advanced users are welcome to check in platform-specific lockdirs and configure their projects to choose a specific lockdir on a specific platform, but the default lockdir generated by the majority of users (who probably don’t realize at first that it’s platform-specific) should not be checked in.

When was making things inaccessible ever proposed? Clearly, the solver is always available as a fallback. In fact, such users are no worse off than in your scheme. If anything, they’re better off because they know they are trying an untested build plan that has not been vetted.

A solver-less lock file will always have better performance when it’s an option. You are saying the gap will narrow, but I think that if anything, it will widen. If we ever implement binary caching, downloading all binary artifacts for all packages will be done concurrently and will finish before the opam repository is downloaded.

I’ll give one more reason why I dislike relying on the opam repository commit hash. The solving technology available to dune today is problematic. I don’t know how we’re going to address this issue, but eventually we will have to explore some options. As a consequence, dune reserves the right to change the solver algorithm without versioning. This means that we will not be able to guarantee that we’ll produce the same build plans given a particular repo hash. Users will find that their build plans are in fact not so predictable between dune versions. Lock directories have no such problem since they are versioned explicitly (because it is much more feasible to do so).

Regardless, I recognize the issue of lock directories today. Indeed, I’ve put a lot of work into making opam repositories first class in dune. I would not have done that if I thought the workflow was useless. I just think that it has its own disadvantages and is not a good fit for many projects. I don’t know whether it should be the default or not, but I think we should try to evolve both workflows and explain how they compare to users.

2 Likes

Just to make it plain for those of us in the back row - what is the tuple that would guarantee you a valid lock-file/dir?

Presumably (x86-64, linux, debian-12) should be enough but I’m guessing (x86-64, linux, some-rolling-release) wouldn’t be (because a C-library might be a dependency and get upgraded). Likewise anything with Windows-11 and a C-library you need to download.