It’s 2025, and it’s becoming hard to ignore that LLMs are here to stay and are changing the way we program at large. Or at least, it feels that way.
For a while, I’ve been kind of a gaulois réfractaire (a recalcitrant Gaul) myself when it comes to adopting this technology. Over the month of May, I decided to change that and at least try to integrate LLMs into my daily toolkit. Since I write OCaml professionally, I picked an arbitrary project in a domain I wasn’t familiar with, to see the impact of LLMs on a more niche language (no offense ahah).
In particular, I got familiar with Eio in the process, and I wanted to discuss the following part of my article with the OCaml community:
For a while, people learned to become search-engine friendly; they turned themselves into experts on Google’s algorithm, and so on. Will we do the same thing with LLMs? Are there guidelines? How can we “fix”, at our level, a situation where an LLM got a software library we wrote completely wrong?
Hello Thomas, many thanks for the very nice topic.
I am guessing that, as a library author, one thing that could help LLMs is to have very good documentation, where one pays attention to give examples in which the problem description is very close (in number of tokens) to the program that solves it?
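Concretely, I am imagining something like this in an `.mli` (a made-up example; `with_retry` and its signature are hypothetical, just to illustrate the idea):

```ocaml
(** [with_retry ~max_attempts f] runs [f], retrying on exception,
    at most [max_attempts] times in total.

    The problem statement and the program that solves it stay close:
    {[
      (* Fetch a URL, retrying up to three times on failure. *)
      let body = with_retry ~max_attempts:3 (fun () -> fetch url)
    ]} *)
val with_retry : max_attempts:int -> (unit -> 'a) -> 'a
```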
One nice side effect of “hallucinations” which I noticed in my own projects, on the rare occasions when they do occur, is that when they involve my own modules they usually turn out to be excellent suggestions of missing functions, coherent with my own naming and calling conventions. I’d say over half the time I end up either creating the function to simplify the call site I was working on, or renaming an existing one whose name had strayed too far from my own conventions.
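To make that concrete, here is a reconstructed, hypothetical example (the names and code are made up, not from a real session):

```ocaml
(* What happens, roughly: the model calls Result_ext.map_error_msg,
   which did not exist in my codebase. The name fits my conventions,
   so I write the function rather than rewriting the call site. *)
module Result_ext = struct
  (* The helper the model "hallucinated", now real. *)
  let map_error_msg ~f = function
    | Ok x -> Ok x
    | Error e -> Error (f e)
end

(* A stub so the example stands alone. *)
let read_file path =
  try Ok (In_channel.with_open_text path In_channel.input_all)
  with Sys_error e -> Error e

(* The call site the model generated, kept as-is: *)
let load_config path =
  read_file path |> Result_ext.map_error_msg ~f:(fun e -> "config: " ^ e)
```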
For less popular languages like OCaml, or cutting-edge libraries like Eio, I think I’d probably include the relevant third-party module docs in my context if the LLM doesn’t master them well enough. That’s where clients with full context control, like Aider, help quite a bit. For example, I have 200K context tokens available with my Anthropic key, which leaves plenty of room for full files as context when necessary.
For now, I’ll take my best guess and say it’s “simply” a matter of context.
Objectively, if we can give an LLM an MCP server/tool to read the generated docs of a library, it’ll probably do much better.
I tried and haven’t finished much, because opam and ocaml.org don’t offer a way for me to easily query the docs of a library by version. That’s easily worked around, though, by storing that data myself for the MCP server.
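What I have in mind is roughly this (a minimal sketch; the directory layout and function names are hypothetical, not a real opam or ocaml.org interface):

```ocaml
(* Sketch: serve odoc-generated pages from a local store keyed by
   package and version, so an MCP tool can look them up directly. *)
let docs_root = "/var/lib/ocaml-docs"

let docs_path ~pkg ~version =
  (* e.g. "/var/lib/ocaml-docs/eio/1.2/index.html" *)
  List.fold_left Filename.concat docs_root [ pkg; version; "index.html" ]

let read_docs ~pkg ~version =
  let path = docs_path ~pkg ~version in
  if Sys.file_exists path
  then Ok (In_channel.with_open_text path In_channel.input_all)
  else Error (Printf.sprintf "no stored docs for %s.%s" pkg version)
```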
I’ll get around to finishing up my experiment, as I just wrapped up interviews.
That’s interesting, because when digging a little into the cohttp-eio hallucinations, I figured out that some could actually come from discrepancies with cohttp-lwt.
I need to learn how to curate contexts, indeed. Will deffo have a look at Aider!
@dangdennis let us know how it turns out! I’m pretty curious, actually.
Zach Daniel from the Elixir community wrote a blog post on how to improve LLMs’ use of available libraries, which in turn reduces low-quality questions. See the “Crowdsource your context” heading in particular.