Best practices around using external libraries on opam?

I would like to make sure that I can still build my OCaml project via opam if GitHub is down. My builds work like this: every build gets a fresh build VM that installs everything via opam and then produces my artifact.

What are the best practices in terms of maintaining local copies of all dependencies? Is it easy to maintain an opam cache on, say, S3, that all of my installs will go through?

If I had this concern, I’d create a local opam switch, copy my project’s _opam directory to the build VM after cloning the project into it but before running the build, and see whether the build still works.
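
A minimal sketch of that experiment, assuming opam 2.x (the compiler version is just an example). One caveat worth knowing: opam switches embed absolute paths, so the project would have to sit at the identical path on the build VM.

```sh
# Create a local switch; ./_opam ends up holding the compiler and all
# dependencies (4.14.1 is an arbitrary example version).
opam switch create . 4.14.1
opam install . --deps-only

# Archive the switch so the build VM can reuse it.
tar czf opam-switch.tar.gz _opam

# On the build VM, after cloning the project to the *same* path:
#   tar xzf opam-switch.tar.gz
#   eval $(opam env)
```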

You could build and upload a Docker image that already contains your project’s build dependencies, built and present in the dune cache, and start every build from there (sketched below). The image would need to be rebuilt every time the dependencies change, which would be annoying and time-consuming, but when the dependencies don’t change builds should be fast: each one just downloads the image and starts compiling from there.
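
A hypothetical sketch of such an image; the base image tag and project layout are assumptions on my part, not something prescribed by opam or dune:

```dockerfile
# Base image with opam and a compiler preinstalled (tag is an assumption).
FROM ocaml/opam:debian-12-ocaml-4.14
WORKDIR /home/opam/project

# Copy only the dependency metadata first, so Docker keeps this expensive
# layer cached until the dependencies actually change.
COPY --chown=opam *.opam ./
RUN opam install . --deps-only --yes

# CI builds start FROM this image, add the sources, and run dune build;
# only the project itself is left to compile.
```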

Another option would be to use Nix as the dependency management system and Cachix as a build-artifact caching service. It would definitely need more setup, but it would give you a fine-grained cache of built dependencies; a small usage sketch follows.
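
Day-to-day use might then look something like this, assuming the project is already packaged with Nix (the cache name is hypothetical):

```sh
# Configure the Cachix binary cache as a substituter (name is made up).
cachix use my-project-cache

# Build the project; dependencies already present in the cache are
# downloaded instead of rebuilt.
nix build
```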

Actually, what’s your specific concern with GitHub? That you may have a networking issue that makes GitHub inaccessible, or that GitHub itself could have a reliability problem that stalls your workflow?

FWIW, this is something companies worry about quite a bit.

Google opted not to use GitHub to coordinate Chrome and Android open-source development because (among other reasons) it was concerned that GitHub was a single point of failure: at the time, GitHub could only serve out of one datacenter. Instead, Google developed a multi-master Git service and its own code review/CI system, Gerrit.

Presumably Microsoft also learned that they were becoming very strategically dependent on GitHub and solved this problem by acquiring them.

I believe GitHub has addressed the all-eggs-in-one-datacenter concern since then though.

There are several levels of cache used by opam (a quick way to inspect them is sketched after this list):
① normally, opam does not rely on GitHub’s servers at all, unless you manually configured it to use the git repository rather than the one at opam.ocaml.org
② package archives are mirrored at opam.ocaml.org too, which is the primary source opam will try to fetch from
③ unless you run opam clean, downloaded archives are kept locally in ~/.opam/download-cache
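
A quick way to inspect those levels, assuming a default opam 2.x setup:

```sh
# ① which repositories opam fetches metadata from (default: opam.ocaml.org)
opam repository list --all

# ② is server-side (the archive mirror at opam.ocaml.org), so there is
#    nothing local to check for it.

# ③ archives kept locally until you run 'opam clean'
ls ~/.opam/download-cache
```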

Of course, if you use the repository through git directly, that may disable the use of the remote cache; or you may have packages pinned to git URLs.
At least for the former, you can manually set archive-mirrors: "https://opam.ocaml.org/cache" in ~/.opam/config to have opam attempt to use that cache regardless of where the package definition was found.
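
Concretely, the relevant excerpt of ~/.opam/config would look like this:

```
# ~/.opam/config (excerpt): always try the central archive cache,
# even for packages whose definitions came from a git repository.
archive-mirrors: "https://opam.ocaml.org/cache"
```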

Last, generating a local cache is quite easy (a full sketch follows the list):

  • git clone opam-repository
  • optionally, run opam admin filter from the clone directory so that you don’t cache the entire repo…
  • run opam admin cache to fetch all archives locally
  • use that local repo through opam repo add --all, or just configure the cache dir globally as above (using a file: URL)
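
Putting the steps together, a hedged sketch (the package and repository names are placeholders):

```sh
# 1. Clone the upstream package repository.
git clone https://github.com/ocaml/opam-repository.git
cd opam-repository

# 2. Optionally trim it so you don't cache thousands of packages
#    (see 'opam admin filter --help' for the selection options).
opam admin filter my-package

# 3. Fetch every remaining archive into ./cache.
opam admin cache

# 4. Use the clone as a repository for all switches.
opam repo add local-mirror . --all
```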

I also asked a similar question some time ago: How to setup local OPAM mirror

TL;DR: there is no easy step-by-step guide, so I opened an issue asking to document/change opam to make setting up a local or regional mirror easier: https://github.com/ocaml/opam/issues/4103

A related option, which we use to keep Facebook build machines from DoSing opam or GitHub, is to build a tarball of the dependencies, based on a lock file, using the extract_mini_repository script (rough sketch below). This has worked well for us, but you need to be willing to move to a lockfile-based workflow; otherwise you will be churning dependencies very often and will have significant skew between local and CI builds.
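
A rough sketch of that workflow; the exact invocation of extract_mini_repository and the file names here are assumptions on my part, and opam lock support depends on your opam version:

```sh
# Pin the exact dependency versions the project builds against.
opam lock ./myproject.opam            # writes myproject.opam.locked

# Bundle everything the lock file references into a tarball that build
# machines can install from without reaching opam.ocaml.org or GitHub
# (this invocation of extract_mini_repository is an assumption).
./extract_mini_repository.sh myproject.opam.locked deps.tar.gz
```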
