.opam directory - Why so many directories and files?

While learning about opam, I read about opam init.

opam needs to initialize its internal state in a ~/.opam directory to work.
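For reference, here is a rough sketch of what opam init puts at the top level of that directory (names taken from an opam 2.x install; the exact contents vary by version):

$ ls ~/.opam
config  default  download-cache  log  opam-init  plugins  repo
# config          global opam configuration
# default         one directory per switch (compiler plus installed packages)
# download-cache  cached source archives, shared across switches
# log             logs of past opam actions
# opam-init       shell initialisation scripts
# plugins         installed opam plugins
# repo            metadata of the configured repositories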

So I used the Linux tree command to see what was there, expecting less than a page of metadata.

groot@Galaxy:~/.opam$ tree

Needless to say, those in the know would not be surprised at the result, but I was:


28309 directories, 33903 files
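To see where the bulk actually sits, a couple of standard one-liners are more telling than the raw tree output:

$ find ~/.opam | wc -l            # total entries, roughly tree's summary line
$ du -sh ~/.opam/* | sort -h      # size per top-level entry, largest last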

Why so many directories and files?

Does one really need all of the extra versions of the code?

I believe the sources are stored there. Opam does not distribute binaries; it has to download sources and compile them.

Oh, and keeping the sources is useful: if you want to install the same package on another switch, or reinstall it because a dependency was updated, you do not need to download it a second time.
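If I read the layout correctly, that shared archive cache lives under ~/.opam/download-cache in opam 2.x, so its contribution to the total is easy to measure (path assumes a default opam root):

$ du -sh ~/.opam/download-cache   # total size of the shared archive cache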

Thanks.

I get that having the versions locally is nice, but

groot@Galaxy:~$ sudo du -sh .opam
1.9G .opam

taking up 1.9G is something users should be made aware of, and given a choice about, before that much space is used. I assure you that if I need space on my drive, the entire OCaml plethora will be one of the first things to go. If I were making Docker containers for separate projects (think VS Code remote development), this would not do. :slightly_smiling_face:

2 Likes

While I generally agree that 1.9GB is maybe a lot of data for what is effectively metadata, I think a lot of this reaction comes from using computers in the '90s, when hard drives were about 2GB in size.

But 30 years later, how much is 1.9GB in storage? I just looked at a price comparison website and it doesn’t even list price per GB anymore, instead opting for price per TB. Looking at SSDs, this is about 0.07€ per GB, so the one-time storage cost of your copy of opam-repository (which is growing rather slowly) is about 14ct.
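(Spelling out the arithmetic: 1.9 GB × 0.07 €/GB ≈ 0.13 €, hence the roughly 14ct figure.)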

I am not sure investigating this 14ct saving has been worth your time.

The compiled binaries are also in this folder, and this includes the compiler(s). Compilers take a lot of space: for instance, on my computer the rustup binary alone takes 7.7M, and the toolchains it manages take far more, so having three versions of a compiler would approach 2GB. If you have a few switches with a few packages installed on each, I would say that 2GB is very reasonable.
Maybe there is a command to discard the sources, though? That would speed up everything.

Not sure what the expected “speed up” is. But look at opam clean --help.
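From memory of the opam 2.x man page (treat opam clean --help as authoritative), the relevant flags are roughly:

$ opam clean --help
# commonly useful flags in opam 2.x (descriptions paraphrased from memory):
#   -s, --switch-cleanup     remove build directories and sources of packages
#                            that are no longer installed or pinned
#   -c, --download-cache     clear the shared cache of downloaded archives
#   -r, --repo-cache         clear the repository cache
#   -a, --all-switches       run the switch cleanup in every switch
#   --logs                   clear the logs
#   --unused-repositories    remove repositories not used by any switch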

Thanks.

Here are the results

groot@Galaxy:~$ opam clean
Cleaning up switch 4.13.1
Clearing cache of downloaded files
Clearing logs

groot@Galaxy:~$ sudo du -sh .opam
1.8G    .opam

groot@Galaxy:~$ opam switch 4.11.1

groot@Galaxy:~$ opam switch
#  switch   compiler                    description
→  4.11.1   ocaml-base-compiler.4.11.1  4.11.1
   4.13.1   ocaml-base-compiler.4.13.1  4.13.1
   default  ocaml-system.4.08.1         default

groot@Galaxy:~$ opam clean
Cleaning up switch 4.11.1
Clearing cache of downloaded files
Clearing logs

groot@Galaxy:~$ sudo du -sh .opam
1.6G    .opam

groot@Galaxy:~$ opam switch default

groot@Galaxy:~$ opam switch
#  switch   compiler                    description
   4.11.1   ocaml-base-compiler.4.11.1  4.11.1
   4.13.1   ocaml-base-compiler.4.13.1  4.13.1
→  default  ocaml-system.4.08.1         default

groot@Galaxy:~$ opam clean
Cleaning up switch default
Clearing cache of downloaded files
Clearing logs

groot@Galaxy:~$ sudo du -sh .opam
1.6G    .opam

Using the -a option

groot@Galaxy:~$ opam clean -a
Cleaning up switch 4.11.1
Cleaning up switch 4.13.1
Cleaning up switch default

groot@Galaxy:~$ sudo du -sh .opam
1.6G    .opam

Using the -c option

groot@Galaxy:~$ opam clean -c
Clearing cache of downloaded files

groot@Galaxy:~$ sudo du -sh .opam
1.6G    .opam

Using --unused-repositories

groot@Galaxy:~$ opam clean --unused-repositories
Updating /home/groot/.opam/repo/repos-config

groot@Galaxy:~$ sudo du -sh .opam
1.6G    .opam

groot@Galaxy:~$ opam clean -a
Cleaning up switch 4.11.1
Cleaning up switch 4.13.1
Cleaning up switch default

groot@Galaxy:~$ sudo du -sh .opam
1.6G    .opam

Using -r

groot@Galaxy:~$ opam clean -r
Clearing repository cache

groot@Galaxy:~$ sudo du -sh .opam
1.6G    .opam

Checking if update with upgrade adds more back

groot@Galaxy:~$ opam update

<><> Updating package repositories ><><><><><><><><><><><><><><><><><><><><><><>
[default] synchronised from https://opam.ocaml.org
Now run 'opam upgrade' to apply any package updates.

groot@Galaxy:~$ opam upgrade
Everything as up-to-date as possible (run with --verbose to show unavailable upgrades).
However, you may "opam upgrade" these packages explicitly, which will ask permission to downgrade or uninstall the
conflicting packages.
Nothing to do.

groot@Galaxy:~$ sudo du -sh .opam
1.6G    .opam

Me neither, I meant to say “free up some space”.

1 Like

Storage size might not be the best metric to evaluate the footprint of Opam. For example, on my computer, despite being relatively small with respect to the total size of the drive, the .opam directory is already occupying 20% of the inodes of the filesystem. And I suppose that, for people using lots of switches, the ratio grows even higher. So, people might inadvertently saturate their filesystem by creating Opam switches, while there is still a lot of unused space on their drive.
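For anyone who wants to check this on their own machine (the 20% figure above is of course specific to mine), the standard command is:

$ df -i ~/.opam   # inodes used vs. available on the filesystem holding the opam root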

6 Likes

Yes, fair, I haven’t thought of that.

Upon looking into why I was never affected by this, it seems to be because APFS on macOS has 64-bit inode numbers and btrfs doesn’t have inodes in that sense. ZFS also has 48/64-bit inode numbers, and NTFS on Windows seemingly doesn’t have inode limitations, so of the popular filesystems this seems to mostly affect ext4.

Note that switches don’t have their own copies of opam-repository, so on one machine you usually need only one copy of it (per user). For example, I currently have 35 local switches (and two copies of opam-repository, since I have a git clone of the repo, but that’s only important if you submit packages).

I wondered the same recently. (Transferring my work folder involved copying 1'000'000 files, more than 600'000 of which were due to local opam switches (the rest was mostly a personal copy of the opam repository), for more than 22GB. It took a long while to copy over ssh. This was after clearing the caches, and I wished there had been an option to erase even more needless information, even if it forced me to recompile later on. This goes against @Leonidas’s experience, so I do not completely understand what is going on.)
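As an aside, for that many small files a single tar stream over the ssh connection usually beats a recursive copy, since it avoids a round trip per file. A sketch, with a hypothetical host name (and note that opam switches hard-code absolute paths, so this only makes sense when the destination paths are identical):

$ tar -C ~ -cf - .opam | ssh remote-host 'tar -C ~ -xf -'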

This argument made me react:

how much is 1.9GB in storage?

A contributor to OCaml made a similar argument recently regarding RAM usage (the details are unimportant here). I have not felt limited in SSD space, but I have been limited in RAM despite having an amount that should have been more than enough, and the proposition would likely have resulted in the OOM killer being triggered for me.

The first problem is assuming that one’s own situation is representative. Even if one particular assumption happens to be sort-of correct most of the time, when you integrate the personal assumptions every contributor makes by the seat of their pants, you end up with a product that fits no one (the average-person fallacy). One should be wary of extrapolating from incomplete personal experience, and instead be happy to sometimes say “let’s not assume”.

The second problem is that everybody reasoning this way is probably what has been responsible for the inflation in RAM usage across all programs, meaning that with just a desktop, a web browser, and a Java app running in the background (and, for some reason, a process called snap-store taking 600MB which I have never used), there has been little room left even today. My new machine has twice the RAM and is only half full; let’s see how long it takes for developers to catch up.

This reminded me of a complaint from a PL researcher about dependency hell with opam, who explained to me that the advice given in their group was to create a new local opam switch for each project. So they probably end up with lots of switches, which accumulate over time.
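For context, a local switch lives inside the project directory itself, so each one carries its own compiler and package tree; something like the following (the compiler version here is only an example):

$ cd my-project                 # hypothetical project directory
$ opam switch create . 4.14.1   # creates ./_opam with a full compiler and libraries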

(I am complaining more about the argument than about opam; I had never noticed the issue before, and it has only happened once.)

5 Likes