It’s 2025, and it’s becoming harder to ignore that LLMs are here to stay and are changing the way we program at large. Or at least, it feels like it.
For a while, I’ve been something of a stubborn holdout (a gaulois réfractaire) myself when it comes to adopting this technology. Over the month of May, I decided to change that, and at least try to integrate LLMs into my daily toolkit. Since I write OCaml professionally, I picked an arbitrary project in a domain I wasn’t familiar with, to see the impact of LLMs on a more niche language (no offense ahah).
In particular, I got myself familiar with Eio in the process, and I wanted to discuss the following part of my article with the OCaml community:
For a while, people learnt to become search-engine friendly. They turned themselves into experts on Google’s algorithm, and so on. Will we do the same thing with LLMs? Are there guidelines? How can we “fix”, at our level, a situation where an LLM got a software library we wrote completely wrong?
Hello Thomas, many thanks for the very nice topic.
I am guessing that, as a library author, one thing that could help LLMs is to have very good documentation, where one pays attention to giving examples in which the problem description is very close (in number of tokens) to the program that solves the problem?
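For instance, a minimal sketch of what that could look like in an odoc comment, with the prose statement of the task sitting right next to a short example that solves it (the function name and file are made up):

```ocaml
(** [lines_of_file path] reads the file at [path] and returns its lines.

    To print every line of ["config.txt"]:
    {[
      let () =
        lines_of_file "config.txt" |> List.iter print_endline
    ]} *)
val lines_of_file : string -> string list
```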
One nice side effect of “hallucinations” which I noticed in my own projects, on the rare occasions when they do occur, is that when they’re in my own modules they usually turn out to be excellent suggestions for missing functions, coherent with my own naming and calling conventions. I’d say over half the time I end up creating the suggested function to simplify the calling site I was working on, or renaming an existing function it couldn’t find because its name had strayed too far from my own conventions.
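A minimal sketch of the pattern, with invented names rather than code from an actual project: the model calls a helper that doesn’t exist yet, and rather than rewriting the call site I add the helper, since it fits my conventions anyway.

```ocaml
(* The helper the model "hallucinated"; it matched my naming style,
   so I implemented it rather than fixing the call site. *)
module Log = struct
  let with_timestamp msg =
    Printf.printf "[%.3f] %s\n%!" (Sys.time ()) msg
end

(* The call site as the model wrote it, now valid: *)
let () = Log.with_timestamp "server started"
```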
For less popular languages like OCaml, or cutting-edge libraries like Eio, I think I’d probably include the relevant third-party module docs in my context if the LLM doesn’t master them well enough. That’s where clients with full context control like Aider help quite a bit. For example, I have 200K context tokens available with my Anthropic key, plenty of room for full files as context when necessary.
For now, my best guess is that it’s “simply” a matter of context.
Objectively, if we can have an MCP server/tool for an LLM to read the generated docs of a library, perhaps it’ll do much better.
I tried, but haven’t gotten very far because opam and ocaml.org don’t offer a way for me to easily query the docs of a library by version. That’s easily worked around, though, by storing that data myself for the MCP server.
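As a rough sketch of that workaround (the names and on-disk layout are assumptions, not the actual tool): store the rendered docs locally, keyed by package and version, and have the MCP tool read from that store.

```ocaml
(* Hypothetical local doc store for an MCP tool: one file per module,
   laid out as <root>/<package>/<version>/<Module>.md. *)
let doc_path ~root ~package ~version ~module_name =
  List.fold_left Filename.concat root [ package; version; module_name ^ ".md" ]

(* Returns the stored docs, or [None] if that package/version was never indexed. *)
let lookup_doc ~root ~package ~version ~module_name =
  let file = doc_path ~root ~package ~version ~module_name in
  if Sys.file_exists file then
    Some (In_channel.with_open_text file In_channel.input_all)
  else None
```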
I’ll get around to finishing up my experiment now that I’ve wrapped up interviews.
That’s interesting, because when digging a little into the cohttp-eio hallucinations I figured out that some could actually come from discrepancies with cohttp-lwt.
I need to learn how to curate contexts, indeed. Will deffo have a look at Aider!
@dangdennis let us know how it turns out! I’m pretty curious actually.
Zach Daniel from the Elixir community wrote a blog post on how to improve LLMs’ usage of available libraries, which in turn reduces low-quality questions. See the “Crowdsource your context” heading in particular.
I just came across context7, and getting your language or library included there seems like a good way of getting more value out of LLMs while reducing hallucinations.
@sadiq and I are doing some similar experiments using the output from the docs CI that powers ocaml.org/p/* - the repo is sadiqj/odoc-llm on GitHub, though this requires that you have the raw output from the CI available. I could make the data public, if that would be useful? The aim here is to be able to have an MCP server that can query which packages you might like to use for a project - so you don’t need to have them installed locally.
Of course, once you’ve decided on your packages and got them installed, you might not get the interfaces that ocaml.org shows, as a library’s interface can depend on its dependency libraries (see my previous post for a description of why). So maybe we’ll end up with a dune mcp subcommand that would allow for precise querying of your packages.
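A tiny, made-up example of that dependency effect: if a library’s public interface re-exports something from a dependency, the signature you see locally tracks whatever version of that dependency your switch installed, which may not match what ocaml.org rendered.

```ocaml
(* Hypothetical example: [Dep] stands in for a third-party dependency whose
   [Config.t] can change between releases. *)
module Dep = struct
  module Config = struct
    type t = { verbose : bool } (* another release of [Dep] may define this differently *)
  end
end

(* [Wrapper] re-exports the dependency's type in its public interface, so the
   interface you see locally depends on the installed version of [Dep]. *)
module Wrapper : sig
  type config = Dep.Config.t
  val run : config -> unit
end = struct
  type config = Dep.Config.t
  let run _ = ()
end
```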