A Proposal for Voluntary AI Disclosure in OCaml Code

That might be true, but then who steers the political positions of opam-repo? There’s absolutely no framework today for the community to make decisions. Say you organize a poll “should opam-repo ban any package that has been touched by an LLM” and 55% of the respondents agree. Is that enough? Does this mean 45% of people go away and fork the opam-repo?

4 Likes

I have the impression that some arguments given here assume that the opam-repository is just a repository for opam, in particular the argument that there could be other repositories. I believe that this is not the case in practice. Off the top of my head (and correct me if I am wrong):

  • the package documentation on OCaml.org is built from opam-repository
  • the CI and checks for new OCaml releases run against opam-repo
  • for questions such as “Is this feature widely used?”, the answer often comes from looking at packages in opam-repo
  • Sherlodoc and sherlocode are also based on opam-repository

Also, I would like to point out that, if we have an AI disclosure procedure, then it would not be too hard to have an LLM-free repository that is automatically synced with opam-repo.

3 Likes

It would mostly be up to the opam repo maintainers (and whatever body manages them if any) TBH. If maintainers upstream agree that generated content is too much of a burden, and there is a wide acceptance among repo consumers downstream that these artifacts are radioactive, tainting, etc., this banning may be an effective way to signal that fact to the rest of the community, gently steering them away from making an otherwise easy choice to submit such content.

We’re arguing as if half the opam repo is LLM output. Nothing will happen today if such a decision is taken. OTOH it immediately signals trust over a level of assurance in upstream and sets the rules for upcoming submissions.

It won’t stop motivated “bad actors” in practice – although there’s already little gain from being a “bad actor” in this case, and there are a few things to lose as pointed out.
It adds sufficient social/process friction and a layer of participation – people can report generated content that slips through for example.

This generally just makes the better choice the easy one. Annotation makes the better choice more work and a personal choice, so then an automatically filtered slop-free repo is useless. A manual one is practically a hard fork with little support because it removes rather than adds packages. Essentially an inversion of what I discussed earlier.

I actually think that there are some very direct and immediate implications, like corporate actors ceasing to contribute code and moving away from OCaml.

1 Like

Genuinely why? What’s the logic behind [Corporate Actors], with the countless hours of labor and millions of lines of OCaml code they already benefit from and invest in, with their ability to trivially bankroll, like, one employee to keep check on a repo extending upstream or EEE-style slopfork your favorite library, what’s the logic behind them just … directly and immediately dropping all that and rewriting their stack in some other ecosystem? For not getting a library into opam?

Does opam currently accept proprietary libraries? Wouldn’t rejecting those alienate some other corporate actors which may want to benefit from the opam infra etc? Same deal.

Large entities will easily figure some way to live with this. It’s volunteers with their limited bandwidth and funding who’ll suffer a thousand papercuts dealing with bot code and conversing with bots in submission reviews and running bot artifacts in CI.

I suppose pushing in this [Corporate Actors]-pleasing direction would just put us closer to the “the uploader holds complete liability” model of npm, in order to remain sustainable. And I think walking away from that direction is saner for everyone.

If opam is permissive to LLM output, that also means you get the short end of it, not just the maintainers, because as more and more packages get tainted, more and more interactions with packages and docs become a question of “is this hallucinated”, the question of “is this feature widely used” becomes meaningless because bots could hyperfocus on an unused feature much like they hyperfocus on rarely used words and punctuation in prose. And so on.

1 Like

Just my 2 cents:

It would make sense to have a filter or a view on the opam repository that only exposes LLM-free packages. This requires updating all packages with (retroactive) LLM disclosure. It should also be noted whether the policy applies to all AI – I assume only to LLM-based, not to symbolic code synthesis.

It would require being precise about what LLM-free means. What if generated code is retyped by a human? From the ocaml-ai-disclosure perspective, the questions are whether the cutoff should be at none or also include ai-assisted, and where the boundary of ai-assisted lies.
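To make the cutoff question concrete, here is a minimal OCaml sketch of such a filtered view. Everything in it is hypothetical: the `disclosure` type, the `Ai_generated` level, and the `package` record are illustrative and not taken from the proposal. Only the levels none and ai-assisted come from the discussion, and excluding packages that carry no disclosure at all is an assumption.

```ocaml
(* Hypothetical disclosure levels, from least to most AI involvement.
   Only "none" (No_ai) and "ai-assisted" appear in the discussion;
   Ai_generated is an illustrative assumption. *)
type disclosure = No_ai | Ai_assisted | Ai_generated

type package = {
  name : string;
  disclosure : disclosure option; (* None = no disclosure recorded *)
}

(* Rank the levels so a cutoff can be read as "at most this level". *)
let rank = function No_ai -> 0 | Ai_assisted -> 1 | Ai_generated -> 2

(* Keep packages whose declared level is at or below [cutoff].
   Assumption: packages without a disclosure are excluded, since a strict
   view cannot say anything about them. *)
let view ~cutoff packages =
  List.filter
    (fun p ->
      match p.disclosure with
      | Some d -> rank d <= rank cutoff
      | None -> false)
    packages

let names ps = List.map (fun p -> p.name) ps

(* Made-up packages, one per case. *)
let repo = [
  { name = "a"; disclosure = Some No_ai };
  { name = "b"; disclosure = Some Ai_assisted };
  { name = "c"; disclosure = Some Ai_generated };
  { name = "d"; disclosure = None };
]

let () =
  Printf.printf "strict: %s\n" (String.concat " " (names (view ~cutoff:No_ai repo)));
  Printf.printf "assisted: %s\n"
    (String.concat " " (names (view ~cutoff:Ai_assisted repo)))
```

Moving the cutoff from `No_ai` to `Ai_assisted` is a one-argument change here; the hard part remains agreeing on what each level means.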

1 Like

We’re a teeny tiny community, and our goal is to attract people to the language, not to put barriers in their way. There are enough barriers already present just based on our weird language idiosyncrasies as compared to the dominant languages in the world. Additionally, we’re entering a dangerous (from our perspective) era of programming where the very concept of needing to specialize in a particular language is called into question, as LLMs get better every day at translating between different languages.

At the same time, there is great potential for us to fill in the massive gaps in our ecosystem with the aid of LLMs, and part of our selling point stems from our language’s ability to aid the code generation process by giving strong type-based feedback. If someone who is less knowledgeable in OCaml is attracted to the language, and generates a library that’s missing in the ecosystem using an agent, should we really be pushing them away?

Adding friction at any time, but especially at this point in time, is a serious self-inflicted wound IMO.

4 Likes

Thanks, everyone, for the thoughtful and polite feedback to my proposal. I’ve received a lot of private comments as well, from many perspectives, so I’ll attempt to digest them here.

The prevailing concerns seem to hinge around quality and security and (to a lesser extent) legalities. This is not to diminish the debate around ethics, but this is such an active and evolving topic that I can’t pin much down there yet.

Security

This is a growing concern for the opam-repository, and one that I think goes well beyond CVE tracking. We often use a social signal as opam repo maintainers to “sniff” a packaging PR and browse around the original source to ensure that it’s reasonable. In many cases, we offer suggestions to the package submitter, many of whom apply those changes.

Now, however, with LLM generated content, this social signal is demolished since every package comes with confidently verbose reams of text. It’s no longer practical to assess code by quickly reading through it, and we’ll need some other measure or automation to help out here. I offer no quick solution here, except for some emerging type driven linters that can distinguish “bad vibe coding” from the more curated agentically boosted approaches.

A major problem here is that backdoors could slip quite easily into this high volume code, which leads onto the next topic of quality.

Quality

We’ve resisted measuring popularity by the number of downloads in opam, preferring instead to look for more stable metrics such as the number of downstream dependencies on a package. This signal has been pretty good; there are islands of popular maintainers and packages, and the opam repository serves to aggregate them all and sort out incompatibilities at package submission time via constraints. In other words, the opam repo is a collective database that is more than the sum of the individual packages.
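As a rough illustration of the "downstream dependencies" metric, the following OCaml sketch counts direct dependents per package. The `package` record is a hypothetical simplification that ignores versions and constraints, and it is not how the opam repository actually computes anything.

```ocaml
(* Hypothetical, simplified model: each package lists its direct
   dependencies by name, ignoring versions and constraints. *)
type package = { name : string; depends : string list }

(* For each package name, count how many packages depend on it directly.
   The result is sorted by descending count, then by name. *)
let direct_dependents (repo : package list) : (string * int) list =
  let counts = Hashtbl.create 16 in
  List.iter
    (fun p ->
      List.iter
        (fun d ->
          let n = Option.value (Hashtbl.find_opt counts d) ~default:0 in
          Hashtbl.replace counts d (n + 1))
        (* De-duplicate so a package counts at most once per dependency. *)
        (List.sort_uniq compare p.depends))
    repo;
  Hashtbl.fold (fun name n acc -> (name, n) :: acc) counts []
  |> List.sort (fun (a, x) (b, y) ->
         match compare y x with 0 -> compare a b | c -> c)

(* Made-up packages: c depends on a and b, b depends on a. *)
let repo = [
  { name = "a"; depends = [] };
  { name = "b"; depends = [ "a" ] };
  { name = "c"; depends = [ "a"; "b" ] };
]

let () =
  List.iter
    (fun (name, n) -> Printf.printf "%s: %d\n" name n)
    (direct_dependents repo)
```

Unlike download counts, this number only moves when another package author decides to depend on the package.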

With LLM generated code, there’s often a desire to ‘throw something over the wall’ and not keep it updated. If we accept these sorts of packages into the opam repository, we’re not improving the health of our collective database, since unmaintained packages could rapidly accrue dependencies without humans behind them.

Therefore, our maintainer intention field might become more important moving forward. I can see us accepting LLM packages (that are beyond a minimum level of slop that we can leave to opam repo maintainer judgement) that are set to a maintenance intent of none. This would, at least, be honest, and a signal that other people are welcome to pick up the baton and iteratively improve that particular effort.

A useful improvement to opam itself may be to avoid packages in the dependency chain that have declared themselves unmaintained.

Legality

This one’s the most potentially serious, especially given the diverse and international nature of our contributors (from individuals, to corporates, to academics). Unfortunately, it’s also the most in flux; the current legal situation is murky, varies by country, and is being actively legislated almost everywhere.

The goal of my proposal above is voluntary disclosure to make future provenance easier to figure out, but I have doubts it’s going to take off: even within my own group, people are reluctant to disclose AI usage for a variety of reasons. Some worry it’s a poor social signal, others have it tightly integrated into their workflows and treat it like a code editor, and yet others are not computing experts and do not distinguish.

However, if you do have strong opinions, then now is the time to feed back to your legislative bodies! @samoht pointed out to me that the EU is seeking feedback on Article 50, so I’ll be submitting a synopsis to that.

So what do we do next?

I have just three concrete suggestions for now:

Make maintenance intent first-class in opam

We could promote the x-maintenance-intent field to be a first class opam field, and actively ‘solve around’ unmaintained packages. We have this really fancy solver, so why not use it?
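One way to picture "solving around" unmaintained packages is as a preference applied to the candidates that already satisfy a dependency. The OCaml sketch below is a toy under stated assumptions: the `intent` type and `candidate` record are hypothetical, and opam's real solver works on a far richer model.

```ocaml
(* Hypothetical, simplified model; these are not opam's data structures.
   Unmaintained stands for a declared maintenance intent of none. *)
type intent = Maintained | Unmaintained

type candidate = { name : string; version : string; intent : intent }

(* Among candidates that all satisfy the same dependency, prefer the
   maintained ones. Fall back to the full list only when no maintained
   candidate exists, so the dependency stays solvable. *)
let solve_around (candidates : candidate list) : candidate list =
  match List.filter (fun c -> c.intent = Maintained) candidates with
  | [] -> candidates
  | maintained -> maintained

(* Made-up candidates for a single dependency. *)
let foo = { name = "foo"; version = "1.0"; intent = Unmaintained }
let bar = { name = "bar"; version = "2.1"; intent = Maintained }

let () =
  List.iter
    (fun c -> Printf.printf "%s.%s\n" c.name c.version)
    (solve_around [ foo; bar ])
```

The fallback branch is a design choice: it treats unmaintained packages as a last resort rather than as forbidden, so existing installs do not break.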

Improve tooling for multiple package repositories

opam supports handling multiple simultaneous package repositories just fine. In fact, we’ve got two active ones: ocaml/opam-repository and ocaml/opam-repository-archive today.

What’s missing is the tooling to manipulate, filter and merge multiple opam repositories easily (I pushed repomin for this purpose). Having better tooling here would allow us to (for example) have:

  • an opam repository just for all OCaml compilers. This is extremely useful for the developers and packagers and testers to have just the build rules and patches in one place.
  • an opam repository that’s compatible with Windows, with non-building packages filtered out.
  • an opam repository that’s got just the latest versions of packages (an equivalent of Stack).
  • an opam repository with only a core of curated and maintained packages that’s small and portable.
  • an opam repository that explicitly accepts ‘work in progress’ LLM generated outputs, for those who want to live on the agentic bleeding edge.
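The repositories listed above can be thought of as filtered views and merges over one collection. The OCaml sketch below uses only the standard library; the `meta` fields and the helper names are hypothetical, and it does not reflect the interface of repomin or of opam itself.

```ocaml
(* Hypothetical model: a repository maps package names to metadata.
   The metadata fields are illustrative, not real opam fields. *)
module Repo = Map.Make (String)

type meta = { builds_on_windows : bool; maintained : bool }

let of_list l = Repo.of_seq (List.to_seq l)
let names repo = List.map fst (Repo.bindings repo)

(* A view is the sub-repository of packages satisfying a predicate. *)
let filter_view pred repo = Repo.filter (fun _name m -> pred m) repo

(* Merge two repositories; on a name clash the first repository wins. *)
let merge a b = Repo.union (fun _name m _other -> Some m) a b

(* Made-up repositories. *)
let main = of_list [
  "x", { builds_on_windows = true; maintained = true };
  "y", { builds_on_windows = false; maintained = true };
  "z", { builds_on_windows = true; maintained = false };
]

let extra = of_list [
  "w", { builds_on_windows = true; maintained = true };
  "x", { builds_on_windows = false; maintained = false };
]

let windows = filter_view (fun m -> m.builds_on_windows) main
let curated = filter_view (fun m -> m.maintained) main
let merged = merge main extra

let () =
  List.iter
    (fun (label, r) ->
      Printf.printf "%s: %s\n" label (String.concat " " (names r)))
    [ "windows", windows; "curated", curated; "merged", merged ]
```

Because views are plain predicates, they compose: a curated Windows repository is just the two filters applied in sequence.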

Is it time to consider a reputation system?

@hannes has worked on conex for many years, but it hasn’t been pushed into opam repository due to the significant hassle involved in key management for end users.

Is it now time to bring back a system like this, but with vouching as a first-class feature? The good folk at tangled.org have been building in “evidence” to their vouching system, which took me back to the good old days of Advogato (for the really oldies among you!).

As with all such efforts, this will require coordination and contribution from all interested in making such change happen :slight_smile: I’m very willing to be corrected on anything I’ve raised above!

14 Likes

+1 for a reputation system.

I firmly believe in greater author autonomy in the era of AI-assisted programming. This requires greater author responsibility.

I’m a proponent of soft reputation systems for individuals in specific professional areas, with transparent governance and organic scoring. There should also probably be an option for individuals to opt out of the score being displayed publicly. Keep in mind I’m not an expert on reputation systems and many social aspects often escape me.

For context, here’s how I see AI-assisted coding as of the last couple of months:

  1. Producing code a lot faster than in the old days implies that much of this code won’t be scrutinized by humans as much as it used to be due to the bounded supply of human reviewers.
  2. Avid AI users tend to be confident about their AI-assisted code while others tend to distrust other people’s AI-assisted code even more than non-AI code written by strangers. This is based on a complex mix of considerations including technical, psychological, social, political, and ideological factors. There is a lack of trust.
  3. Writing AI-assisted code is perceived by its practitioners as equivalent to pair-programming, with the AI assistant at the keyboard and the human holding the bag of chips. They have a sense that there’s no need for yet another human reviewer if the requirement is that a human reviews the code.
  4. The human in charge can request at a whim that large parts of the codebase be rewritten without paying the social consequences of requesting a lot of extra work from their coworker. The AI coder will comply eagerly. This is a reason why the human with an AI-assistant has the potential to act as a very powerful reviewer if they want to, but not much currently guarantees that they will.
  5. Not everyone will do a good job of reviewing their AI-coded work due to skill issues, in addition to the possibility of plain neglect.

7 Likes

First question I would ask about that: is it a feature or a bug?

Seems a non-problem to me.

There are humans behind the opam-repository; they are listed in the governance document. These people do take decisions, and make and enforce policies.

You don’t need a “framework”. If you are unhappy about what these people decide then go build your own repo.

3 Likes

The policies document only contains pretty technical rules; there isn’t even a strong “only free software” rule in there. Ruling on AI usage (beyond banning obvious slop) seems like a totally different level of policing the repo to me.

1 Like

Not really, there is a rule about package usefulness and another about naming, both of which are formally pretty subjective matters.

That’s just to say that in the end it’s up to the people managing the repo to decide what they want it to be, perhaps influenced via community input, or not (but of course what they decide could affect their funding support).

It should also be mentioned that you can be part of these people.


P.S. Just to make clear, while my name is on the list I linked earlier I do not participate in the repository’s management/policy.

3 Likes

That’s fair, you’re right. To me these seem pretty obvious but they are indeed judging what is an acceptable package. :slight_smile:

1 Like