When the container images on https://hub.docker.com/r/ocaml/opam2 were first assembled, the intention was to provide a distribution-specific image with all of the OCaml compilers available in a single image. This in turn allowed for CI systems to pull a single image and then opam switch into the different compilers very efficiently.
However, as time marches on, the number of simultaneous compiler images has become quite large, and this feature doesn’t seem to get much use. Therefore, we are planning to change the format of the OCaml compiler images hosted on hub.docker.com:
continue pushing the multiarch images to both ocurrent/opam and ocaml/opam2 from the same https://github.com/ocurrent/docker-base-images instance. This will not take up much extra space since the content on the Hub is hashed and this is just a set of new tags.
ensure that the CI scripts over at https://github.com/ocaml/ocaml-ci-scripts continue to work uninterrupted. Aside from the lack of multiple compilers, the only observable change in the new images is that WORKDIR is set to /home/opam/src instead of /home/opam/opam-repository.
Comments welcome on all of this – if you are using the “fat” compiler images to reduce your bandwidth, please let me know. Most users I’ve seen prefer the slimmer single-compiler images instead, hence this change. You can comment here or on the https://github.com/ocaml/ocaml.org/issues/1195 tracking issue.
Most pragmatically, this change will let us update the /r/ocaml/opam2 images to contain 4.11. There are various infrastructure issues that are preventing the pushing of the current larger images that will be fixed by the new ocurrent-based CI infrastructure.
To slim the images, does it make sense to cut off the history of the opam repository that ships as part of the Docker image? The opam repository by itself is large, but in a typical use case it only gets updated, not rolled back to an earlier commit.
I looked at docker pull ocaml/opam:debian-10-ocaml-4.09 (about 500MB compressed). The Git history of opam-repository inside the image takes up about 290MB (uncompressed) and packages/ about 150MB. Would it not be a good idea to trim the history in order to slim the image, and to provide a separate image with full history for those who need it? Or does it become impossible to update such a trimmed Git repo? I would argue that basically nobody needs the Git history, but many need the ability to update the opam repository inside the Docker image.
Thanks for the suggestion – it depends on what use you have for shallow images that do not have Git history. A lot of CI systems using these images git checkout a specific Git revision of opam-repo when doing their tests (including the opam-repo-ci). Without the history, that pinning would fail.
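To make the pinning concern concrete, here is a minimal shell sketch. It runs entirely against a throwaway local repository rather than the real opam-repository, and the directory names, file name, and commit messages are all made up for illustration. Under those assumptions, it suggests that a --depth=1 clone can still be updated with new commits, but that checking out an older revision fails because the commit is simply not present.

```shell
#!/bin/sh
set -eu

# Throwaway "upstream" repository standing in for opam-repository (hypothetical).
work=$(mktemp -d)
git init -q "$work/upstream"
cd "$work/upstream"
git config user.email "demo@example.com"
git config user.name "Demo"
for i in 1 2 3; do
  echo "revision $i" > packages.txt
  git add packages.txt
  git commit -q -m "revision $i"
done
old_rev=$(git rev-parse HEAD~1)

# Shallow clone; the file:// URL makes git use its normal transport,
# so that --depth is honoured for a local path.
git clone -q --depth=1 "file://$work/upstream" "$work/shallow"
cd "$work/shallow"
shallow_commits=$(git rev-list --count HEAD)

# Pinning to an older revision fails: that commit is not in the shallow clone.
if git checkout -q "$old_rev" 2>/dev/null; then
  pin_result=ok
else
  pin_result=failed
fi

# Updating still works: add a commit upstream and fast-forward the clone.
cd "$work/upstream"
echo "revision 4" > packages.txt
git commit -q -am "revision 4"
upstream_head=$(git rev-parse HEAD)
cd "$work/shallow"
git pull -q --ff-only
shallow_head=$(git rev-parse HEAD)

echo "commits visible right after the shallow clone: $shallow_commits"
echo "checkout of the older revision: $pin_result"
```

So a trimmed repository does not appear to block plain updates; the thing that breaks is the pin to an earlier commit.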
That doesn’t mean that we can’t figure out some combination of git’s --depth options to reduce the default size of the repo though, since the CI should be able to pull a revision in. One complexity with these images is that the git version is the one inside the container, and so the CIs need to support some pretty ancient versions (like the one in CentOS 7, which is not yet EOL and has users). But if you can come up with a suggestion of how to use modern git to reduce duplication, I’m sure we can figure out the backwards compatibility issues.
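One possible shape for "pulling a revision in", again sketched against a throwaway local repository with made-up names: ask the remote for exactly the pinned commit with git fetch --depth=1 origin followed by the commit hash. This is only an assumption about how a CI could do it, not a description of what opam-repo-ci does. It also relies on the server agreeing to serve a commit requested by hash, which the sketch enables explicitly through the uploadpack.allowReachableSHA1InWant setting on the upstream repository; whether a given host and a given (possibly old) client git support this would need checking.

```shell
#!/bin/sh
set -eu

# Throwaway "upstream" repository standing in for opam-repository (hypothetical).
work=$(mktemp -d)
git init -q "$work/upstream"
cd "$work/upstream"
git config user.email "demo@example.com"
git config user.name "Demo"
# Let clients request a reachable commit by hash, even if no ref points at it.
git config uploadpack.allowReachableSHA1InWant true
for i in 1 2 3; do
  echo "revision $i" > packages.txt
  git add packages.txt
  git commit -q -m "revision $i"
done
pinned_rev=$(git rev-parse HEAD~1)

# Start from a shallow clone that only contains the latest commit.
git clone -q --depth=1 "file://$work/upstream" "$work/shallow"
cd "$work/shallow"

# Fetch only the pinned commit (without its ancestors), then check it out.
git fetch -q --depth=1 origin "$pinned_rev"
git checkout -q "$pinned_rev"

checked_out=$(git rev-parse HEAD)
contents=$(cat packages.txt)
echo "checked out $checked_out with contents: $contents"
```

The trade-off is one extra network round trip per pinned revision in exchange for not shipping the full history in every image.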
If you could open an issue on https://github.com/ocaml/opam/issues noting that disk usage is a concern for git checkouts of opam-repo, that would also be helpful. It’s not been a focus in opam 2.1, but we could improve the situation in a later release.
It indirectly has, but we could still potentially do better. opam 2.1 compresses the repositories, decompressing them (to tmpfs) only when they’re updated and/or when the state cache needs to be regenerated.
This was done by @AltGr in order to speed up opam update (and it does - really, really, noticeably!), but it also has the effect of reducing the disk space required for a git clone of opam-repository from ~250MiB to ~95MiB.
I’ve opened this “PR issue” to track the idea of passing --depth=1. It seems to work, at least so far for me, and reduces the repo clone to 15MiB. Comments and further points welcomed there!
This wikipage seems slightly out of date (it mentions 4.10 as the most recent OCaml release), and it explains that “The default container comes with the latest compiler activated, but also a number of other switches for older revisions of OCaml”. If I understand the first post of this topic correctly, this is not true anymore: the set of images now has a different structure, or at least the multi-switch images are no longer recommended for default use. Do I understand correctly that this documentation is now incorrect?
Thanks! Are you also volunteering to get the documentation for the previous approach clarified? A minimal thing that would help would be to add a disclaimer that points to the new place, but I’m sure someone “in the know” could do something even better.