Is opam vulnerable to the recent package management supply chain attack?

From “Researcher hacks over 35 tech firms in novel supply chain attack”:

Birsan noticed some of the manifest file packages were not present on the public npm repository but were instead PayPal’s privately created npm packages, used and stored internally by the company.

On seeing this, the researcher wondered, should a package by the same name exist in the public npm repository, in addition to a private NodeJS repository, which one would get priority?

Birsan soon realized, should a dependency package used by an application exist in both a public open-source repository and your private build, the public package would get priority and be pulled instead – without needing any action from the developer.

So I guess the question is: does opam allow using both a private and a public repository in the same project, and does it automatically prioritize the public version of a package when solving for dependencies?

I believe that opam asks you to set a priority when you introduce a new repository in your switch. Users decide which repositories get priority (which sounds like the right choice). (Note: you cannot automatically upload software to the public opam repository; submissions are approved by its maintainers, so you would have to be a bit more deceitful to mount a similar attack, although of course this is still possible.)

The Conex project is working on improving the integrity and authenticity of opam repositories, based on recent research on The Update Framework. (Advertisement: this year the OCaml Software Foundation is funding further work on Conex, see the Robur blog post.) Conex does not prevent all kinds of supply-chain attacks, but it lets users check that the packages are really authored by their claimed authors, not some malicious third party.

The default repository does not have a hardcoded higher priority. In particular, this is how the alpha repository works during the alpha or beta release of a new version of OCaml: adding a fresh repository gives it a greater priority than the default one, and thus the patched packages shadow the default ones.

Without addressing the specific attack you link, there is no doubt that opam is susceptible to much simpler “supply chain” attacks involving only the canonical opam repository and its processes. See Opam-repository: security and data integrity posture for some discussion.

For those that aren’t familiar, the biggest headline from that IMO is that opam releases are not immutable (i.e. authors can update the contents of a previously-published library release) as a matter of policy. This means that a malicious actor can land a payload into a library consumer’s build even without that consumer e.g. changing the version number their project depends upon.

OK, based on this design and what @gasche mentioned as well, I am taking that to mean that opam is not vulnerable to this particular supply chain attack, since presumably a private repository would be added as a ‘fresh’ repository on top of the ‘default’ public opam-repository, and thus private packages take priority over public ones.

I disagree with this conclusion (sorry about that). In opam (AFAIU) there are ranks for repositories, which are used as a tie-breaker. But consider the following scenario:

# opam repo
<><> Repository configuration for switch test <><><><><><><><><><><><><><><><>
 1 testrepo2 file:///usr/home/hannes/devel/mirage/testrepo2
 2 testrepo1 file:///usr/home/hannes/devel/mirage/testrepo1
 3 default   https://opam.ocaml.org/

I suspect the first number is the rank. According to opam repo --help:

           Package definitions are looked in the repositories in
           increasing rank order, therefore 1 is the highest priority.
           Negative ints can be used to select from the lowest priority, -1
           being last. set-repos can otherwise be used to explicitly set the
           repository list at once.

Here testrepo1 has a single opam package named extra-file at version 1.0.0, and testrepo2 has the same package at version 0.0.0. Running opam install extra-file installs the highest version overall (1.0.0, from testrepo1), not the highest version available in the repository with the highest priority.

TL;DR: the package names and versions are merged from the repositories, and the rank only plays a role if multiple repositories contain the same version of a package.
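
To make the merging behaviour concrete, here is a minimal toy model in Python. It is not opam's solver: the function names are made up for illustration, and versions are compared as tuples of integers, whereas opam uses its own Debian-style version ordering. It only mirrors the observation above, that candidates from all repositories are merged and the rank matters only when the same version appears more than once.

```python
# Toy model of the behaviour observed above; NOT opam's actual algorithm.
# Assumption: versions are dotted integers (opam's real ordering is richer).

def parse_version(v):
    return tuple(int(part) for part in v.split("."))

def merged_candidates(repos):
    """repos: list of (rank, repo_name, {package: [versions]}).
    Returns {package: {version: repo_name}}. When two repositories carry the
    same (package, version), the lower rank number (higher priority) wins."""
    merged = {}
    for _rank, name, packages in sorted(repos, key=lambda r: r[0]):
        for pkg, versions in packages.items():
            for v in versions:
                # setdefault keeps the entry from the higher-priority repo
                merged.setdefault(pkg, {}).setdefault(v, name)
    return merged

def pick(repos, pkg):
    """Return (version, repo_name) for the highest merged version of pkg."""
    candidates = merged_candidates(repos)[pkg]
    best = max(candidates, key=parse_version)
    return best, candidates[best]

# The configuration from the `opam repo` output above.
repos = [
    (1, "testrepo2", {"extra-file": ["0.0.0"]}),
    (2, "testrepo1", {"extra-file": ["1.0.0"]}),
    (3, "default", {}),
]
print(pick(repos, "extra-file"))
```

In this model the lower-priority testrepo1 wins because it offers the higher version, which is exactly the situation a dependency-confusion attacker would try to create.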

Interesting! Thanks for the analysis! I think you’re right about the opam repo output rank. I believe you can double-check with the more verbose opam repo --all command.

Indeed, leading to the same ranks as above.

Now the question is “how to avoid this threat?” (at least once this is considered a threat worth mitigating; there may well be easier attacks possible in the current state of opam and opam-repository).

I can think of a couple of options:

  • ( a ) only ever use a single opam repository (i.e. inside a company run your own opam-repo without automatic updates from the public repository)
  • ( b ) revise the rank handling in opam - if a repository with a higher rank has a package named “foo”, only consider that repository (this will modify the semantics of opam, likely break some use cases)
  • ( c ) prefix/namespacing of repositories (or qualifying packages with full url) – depend on testrepo2:extra-file or https://github.com/.../testrepo2:extra-file explicitly
  • ( d ) deploy signed package releases and custom delegations using conex. This is not yet possible with the current code; TUF has some specs on how to do it, and I have some ideas how to achieve this with conex. If your company has a custom package “foo”, you specify the public keys that are allowed to release “foo”, and an attacker would need to get their hands on the corresponding private keys to inject a “foo” release.

I suggest going for ( a ) right now :wink:
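
For comparison, option ( b ) could look roughly like the following Python sketch. This is a hypothetical illustration of the proposed semantics, not something opam implements; `pick_strict` is an invented name, and versions are again simplified to dotted integers.

```python
# Hypothetical sketch of option (b): the first repository (by rank) that
# knows the package name is the only one consulted. NOT current opam behaviour.

def parse_version(v):
    return tuple(int(part) for part in v.split("."))

def pick_strict(repos, pkg):
    """repos: list of (rank, repo_name, {package: [versions]}).
    Return (version, repo_name) from the highest-priority repository that
    contains pkg, ignoring all lower-priority repositories for that name."""
    for _rank, name, packages in sorted(repos, key=lambda r: r[0]):
        if pkg in packages:
            return max(packages[pkg], key=parse_version), name
    raise LookupError(f"no repository provides {pkg}")

# Same setup as the testrepo1/testrepo2 scenario in this thread.
repos = [
    (1, "testrepo2", {"extra-file": ["0.0.0"]}),
    (2, "testrepo1", {"extra-file": ["1.0.0"]}),
    (3, "default", {}),
]
print(pick_strict(repos, "extra-file"))
```

Under these semantics a higher version in a lower-priority repository can no longer shadow the private package, at the cost of breaking setups that rely on overlay repositories supplying newer versions.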

Another possibility: more sophisticated package namespacing and publishing capabilities, e.g. what npm has done with scopes like @publisher/package and allowing only authorized users to publish under the @publisher namespace.
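
As a rough illustration of the scope idea, here is a small Python sketch of a publish-time check. The registry contents, user names, and `may_publish` function are all hypothetical and only model the rule "only authorized users may publish under a scope"; this is not npm's or opam's actual API.

```python
# Hypothetical scope registry: scope -> users allowed to publish under it.
SCOPE_OWNERS = {"@publisher": {"alice", "bob"}}

def may_publish(user, package_name):
    """Scoped names look like '@scope/name'. In this toy model unscoped
    names are open to anyone, while scoped names require membership."""
    if not package_name.startswith("@"):
        return True
    scope, sep, name = package_name.partition("/")
    if not sep or not name:
        raise ValueError(f"malformed scoped name: {package_name!r}")
    return user in SCOPE_OWNERS.get(scope, set())

print(may_publish("alice", "@publisher/package"))    # authorized member
print(may_publish("mallory", "@publisher/package"))  # outsider
```

With such a rule, an attacker can still publish an unscoped `package`, but cannot publish anything that resolves as `@publisher/package`.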