Depending on non-OCaml languages from the opam repository

I’m currently reviewing feature requests for the opam repository, and one of the most common ones is for us to support non-OCaml toolchains as dependencies from OCaml packages submitted to our package repository. In recent years, there have been a number of OCaml libraries that depend on Rust, Python or Node, and cannot be easily tested in our automated infrastructure (which currently uses a fixed base image per distribution).

I’ve put together a prototype of how we might solve this easily, without taking on the burden of maintaining non-OCaml toolchains ourselves with limited maintainer resources. Opinions and ideas are welcome on this thread, and the repository is avsm/opam-lang-repo on GitHub ("Install various language development tools via opam").

A multi-language devcontainer package repository

Devcontainers are an emerging mechanism to use container runtimes as a full-fledged development environment. They can support multiple programming languages in one filesystem by means of features, which allow for the activation of a given toolchain alongside others. For example, using features allows for the simultaneous use of Python, Rust and OCaml within one container image, whereas with traditional devcontainers there would be a separate container for each toolchain.

Using the opam solver to manage feature selection

The opam package manager integrates a builtin constraint solver that allows for the selection of a compatible set of dependencies from a package repository that contains all released versions of all packages.

This repository translates published devcontainers into opam packages, such that devcontainer features can be selected by simply adding dependencies to an opam package. Additionally, version constraints on the desired tooling can be added to pick the required versions. For example:

depends: [ "dev-rust" {>="1.68"}
           "dev-ocaml" {>="4.12" & < "5.0"}
           "dev-python"
           "dev-python-optimize" ]

This picks Rust 1.68 or later, any OCaml compiler from 4.12 up to (but not including) 5.0, i.e. 4.12 to 4.14, and any Python version with the optimize flag activated for more efficient code generation.

The Good News

This solution frees the opam-repo maintainers from having to support the myriad other toolchains, and lets us depend on them freely from opam. By adding explicit dependencies like this, we can continue to run automated end-to-end tests for new and existing packages in the OCaml ecosystem, even when they do not exclusively use OCaml.

The Bad News

There are still some limitations to figure out before this is production worthy:

  • The devcontainer installation busts the opam security sandbox, and so cannot be installed simultaneously with normal packages. It would be ok in a CI system where sandboxing is normally disabled. Another option is for these packages to not actually perform the installation, but generate a single install.sh with all the right environment variables. An image generator could then run that script to generate a base image.
  • opam doesn’t currently support composing remote repositories, so some strategy is needed for how to keep this generated repo in sync with anything included in the central repository.
  • Need to support devcontainer boolean defaults correctly (e.g. Python feature), and figure out what to do about arbitrary string options. Env variables could be used to pass in values, but opam can’t recompile if these variables change. Dune does support systematic env variable tracking and recompiles when a tracked variable changes, so this would work in a monorepo.
  • Need to extract feature dependencies into the opam formula as well.
  • Something, something, Nix, instead?
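
The install.sh idea from the first bullet could be sketched roughly as follows. This is only an illustration, not what the prototype does: the helper names emit_feature and generate_install_sh are hypothetical, the ./src/rust/install.sh path and the version values are made up for the example, and values are assumed to contain no spaces or shell metacharacters.

```shell
#!/bin/sh
# Hypothetical sketch: instead of installing anything, each selected feature
# contributes one line to a single install.sh for an image generator to run.

# emit_feature NAME SCRIPT [VAR=VALUE ...]
emit_feature() {
  name=$1; script=$2; shift 2
  printf '# feature: %s\n' "$name"
  # A subshell per feature keeps its variables from leaking into the next one.
  printf '( '
  for kv in "$@"; do
    printf 'export %s; ' "$kv"
  done
  printf 'bash %s )\n' "$script"
}

# In a real setup the feature list would come from the opam solution;
# here it is hard-coded (illustrative values only).
generate_install_sh() {
  printf '#!/bin/sh\nset -e\n'
  emit_feature python ./src/python/install.sh VERSION=3.11 OPTIMIZE=true
  emit_feature rust ./src/rust/install.sh VERSION=1.68
}

generate_install_sh
```

Redirecting the output to a file would give the image generator one script to run outside the opam sandbox.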

A feature of Arch Linux packages is that the version of the library/software/whatever being installed is tracked separately from the version of the package installing it. E.g., this python3 package is for Python version 3.11.3 but it is the second such package (hence the 2 in the package name and also in the pkgrel field in the package file).

This is useful for decoupling software changes from packaging changes. I think it could also be useful in this case, where we have packages describing pieces of software that have their own versioning.

Would it be useful for devcontainers? Maybe even for some conf- packages?

IIRC opam’s notion of version (e.g., what’s a valid version number, how to compare version numbers) is based on Debian’s. Is that the case? Still the case? How would a pkgrel work if so?

That’s a good question, and important to make the whole thing reproducible. In the case of devcontainer descriptions, there is a separate version number for the actual installation script (e.g. see this for Python), which really ought to be tracked somewhere.

The versions we are exposing to users of the repository will not be this version, but rather the version of the programming languages themselves, much as we do with ocaml {>= "4.14"} today in the opam-repository.

Solving this would also solve the “package revisionism” issue, but would require a much bigger change to the opam repository. It needs to happen, but probably separately from this thread. See “package revisionism can break things” (Issue #10531 in ocaml/opam-repository on GitHub) for a related issue.

I don’t have a good answer for how to track the meta-versions yet without manually tracking epochs (which seems error prone)… it’s something that is worth all of us collectively thinking about though.

This repository translates published devcontainers into opam packages, such that devcontainer features can be selected by simply adding dependencies to an opam package. Additionally, version constraints on the desired tooling can be added to pick the required versions.

This is a very nice trick! Do features already contain such constraints, or is it something that is expected only for end-users of these features?

I also see that you are using a lot of depopts and conf packages. It’s a fine use of these features (similar to the way we do this in opam-repository for configuring the OCaml compiler) but I’m wondering if you could simplify some of this by using an obscure feature of opam: the ability to set build variables (either globally or per package) through the environment, by setting OPAMVAR_<var> (or OPAMVAR_<pkg>_<var>).

For instance, instead of this:

install: [
  "env"
  "VERSION=%{version}%"
  "INSTALL_TOOLS=%{dev-python-install-tools:version}%" {dev-python-install-tools:installed}
  "OPTIMIZE=%{dev-python-optimize:version}%" {dev-python-optimize:installed}
  "INSTALL_PATH=%{dev-python-install-path:version}%" {dev-python-install-path:installed}
  "INSTALL_JUPYTERLAB=%{dev-python-install-jupyterlab:version}%" {dev-python-install-jupyterlab:installed}
  "CONFIGURE_JUPYTERLAB_ALLOW_ORIGIN=%{dev-python-configure-jupyterlab-allow-origin:version}%" {dev-python-configure-jupyterlab-allow-origin:installed}
  "HTTP_PROXY=%{dev-python-http-proxy:version}%" {dev-python-http-proxy:installed}
  "bash" "./src/python/install.sh"
]

You could set up an environment by scraping devcontainer-feature.json:

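# Shell variable names cannot contain dashes; this assumes opam looks up OPAMVAR_* with dashes replaced by underscores.
export OPAMVAR_dev_python_install_path_install_tools="$(jq -r .options.installTools.default < devcontainer-feature.json)"
export OPAMVAR_dev_python_install_path_install_path="$(jq -r .options.installPath.default < devcontainer-feature.json)"
opam [...]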

I don’t know if it’s much simpler here, but I found this trick useful in the past when writing scripts. Also, I’m not totally sure why you are encoding this variable into its version field. Couldn’t you just use a string and a package-level variable?
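
As a rough sketch of how those variable names could be derived mechanically: the helpers opamvar_name and feature_env below are hypothetical, the option defaults fed in are sample values rather than ones read from a real devcontainer-feature.json, and the whole thing assumes that opam matches OPAMVAR_<pkg>_<var> with dashes in the package and variable names replaced by underscores (worth checking against the opam manual before relying on it).

```shell
#!/bin/sh
# opamvar_name PACKAGE VARIABLE
# Prints the environment variable name for a package-level opam variable.
# ASSUMPTION: opam replaces dashes with underscores when reading OPAMVAR_*.
opamvar_name() {
  printf 'OPAMVAR_%s_%s\n' "$1" "$2" | tr '-' '_'
}

# feature_env PACKAGE
# Reads "option-name=default" lines on stdin and prints one export per option.
# Values containing single quotes are not handled in this sketch.
feature_env() {
  pkg=$1
  while IFS='=' read -r opt val; do
    printf "export %s='%s'\n" "$(opamvar_name "$pkg" "$opt")" "$val"
  done
}

# Sample input; in practice these pairs would be extracted with jq.
printf 'install-tools=true\ninstall-path=/usr/local/python\n' | feature_env dev-python
```

The printed lines can be passed to eval, or written to a file and sourced, before calling opam.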

Devcontainers potentially solve the CI issue, but they do not seem to solve the problem of integrating opam with language-specific package managers. Currently, using two Rust binding libraries in one binary will just explode at the linking phase. See “Cargo/dune integration” and “Cargo/Opam packaging of a Rust/OCaml project” (#9 by Konstantin_Olkhovski).

At least for the Rust case it would probably be helpful to automatically create a local repository of the Rust crates that come as part of opam packages. Then, when I install an opam package with Rust bits, my project’s build rules can use that opam-generated cargo repo and pick specific versions of the Rust crates, so that bindings work as expected.

For Python, some other approach is probably required. At the very least, opam files should carry metadata that encodes external Python or Rust dependencies, or the presence of Python or Rust source artifacts within the opam package. Tools could then build on that metadata and put the pieces of Python/Rust code in the appropriate places (local repo, etc.) so that they are available when building the current project or running its tests.

Nix is really good for multi-lingual projects. I’ve created a proof of concept for opam to use Nix to provide non-OCaml dependencies: github.com/RyanGibb/opam-lang-repo-nix.
You can add it to opam with opam repo add opam-lang-repo-nix git+https://github.com/RyanGibb/opam-lang-repo-nix.git and add a dependency to your opam file as, e.g. nix-rustc.

The opam files look something like:

opam-version: "2.0"
synopsis: "Rust"
install: [
  "nix-env"
  "-iA"
  "rustc"
  "-f"
  "https://github.com/NixOS/nixpkgs/archive/28e0126876d688cf5fd15da1c73fbaba256574f0.tar.gz"
]

Nix doesn’t, in general, package all the versions of a package (github.com/NixOS/nixpkgs/issues/9682). You can install a previous version by using a specific git revision of Nixpkgs. I’ve cherry-picked some git revisions using lazamar.co.uk/nix-versions/ to get an MVP working for the rust compiler and python interpreter. This is where the 28e0126876d688cf5fd15da1c73fbaba256574f0 you see above is from.
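
Since each package version is just the same opam file with a different Nixpkgs revision, the files could be stamped out from a small template. The following is a sketch only: nix_opam_file is a hypothetical helper, not part of the repository, and it simply reproduces the opam file shown above for a given attribute name and revision.

```shell
#!/bin/sh
# nix_opam_file ATTR SYNOPSIS NIXPKGS_REV
# Prints an opam file that installs the Nixpkgs attribute ATTR from the
# pinned revision NIXPKGS_REV via nix-env.
nix_opam_file() {
  cat <<EOF
opam-version: "2.0"
synopsis: "$2"
install: [
  "nix-env"
  "-iA"
  "$1"
  "-f"
  "https://github.com/NixOS/nixpkgs/archive/$3.tar.gz"
]
EOF
}

# Reproduces the rustc example above.
nix_opam_file rustc Rust 28e0126876d688cf5fd15da1c73fbaba256574f0
```

A generator would call this once per (package version, Nixpkgs revision) pair and write the result into the repository layout.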

Some advantages:

  1. Reproducible and deterministic dependencies. I don’t know a lot about devcontainers, so please correct me if I’m wrong, but it looks like they just run a large number of large ‘install’ bash scripts. I can imagine the user losing track of the side effects of these scripts and of the exact versions of the dependencies they pull in, and not being able to easily reproduce their development environment.
  2. Benefit from the huge number of packages in Nixpkgs.
  3. Allows multiple versions of a package to coexist on the same system without any interference. If the devcontainer install scripts are sandboxed/containerized (again I’m hazy on the details), then another advantage of Nix is it allows transparent sharing of identical dependencies.
  4. Atomic upgrades and rollbacks, which again come for free with Nix.

Some issues:

  1. This requires nix to be installed. We could have opam install nix as well.
  2. It similarly requires sandboxing to be disabled. We could pull in the Nixpkgs revision in advance but Nix will still have to pull in the sources for the derivation closure it’s building.
  3. This uses nix-env to install dependencies to the local user’s profile. While this is less side-effectful than install scripts, as dependencies are still constrained to the Nix store, it would be more desirable to create a shell in which these dependencies are available. Nix flake devShells could be useful.
  4. Similar to the devcontainer repository, we would want a way to keep this repository up-to-date with upstream Nixpkgs. The project I linked to, github.com/lazamar/nix-package-versions, is already doing something very similar in keeping their website up-to-date.

A potential solution to issues 1, 2, and 3 would be for opam to generate a flake.nix file that contains all the dependencies. If we wanted to take this even further, opam could also generate nix derivations for packages from opam-repository. This could be thought of as the inverse approach to something like github.com/tweag/opam-nix.


While I do think Nix would be a great solution for Linux and CI needs, I do worry that it would quickly become a hindrance on any other system. I know NetBSD and FreeBSD both have some Nix support, but every time I’ve attempted to use it the experience just hasn’t been worth it.

Sadly I suspect the devcontainers to have similar problems.

I don’t believe it’s a general solution to this problem, but for development asdf-vm has always worked well for me. I hope that whatever is tried I have an escape hatch to something similar on unsupported systems.
