Why is the OCaml ecosystem poorly documented?

I have several theories as to why there’s a lack of documentation in OCaml. What are yours?

  • lack of comments that don’t need to be closed (e.g. //)
  • most of the code is written by academics
  • comments are poorly formatted by ocamlformat and often look ugly (I tend to delete them just because they break the flow in OCaml)
1 Like
  • The code is clean and self-explanatory
  • Types help to understand meaning
  • .NET probably is not much better. My colleagues complain very often that .NET is poorly documented.
  • We can’t objectively measure the quality of documentation, because the only objective external metric for software is performance. For any well-documented project we can find someone who can’t understand it.
  • Maybe the claim is just false?
2 Likes

If we’re looking for things that support the proposition “OCaml ecosystem seems to be poorly documented”, I think there are stronger candidate reasons (which have recently seen a sizeable amount of work!):

  • a lack of a central documentation explorer and search tool (being solved by V3 ocaml website and sherlodoc)
  • many popular libraries and utilities in the ecosystem were originally dumps of internal proprietary code on github (now seeing lots of care on the documentation front)
  • OCaml itself is constantly improving its own manual with soft documentation, examples, API search, etc. A few years ago that wasn’t the case

Hopefully the situation keeps improving! We’re in somewhat of a flux currently, but definitely at a much better position than in the past, when documentation scarcity was notorious.

9 Likes

The biggest factors in my opinion are the culture, the number of users of the language, and the fact that most OCaml devs are very capable.

These factors lead to a set of problems that you might face directly but that aren’t the root cause. The culture of OCaml and being capable come with a few “beliefs”:

  • “types are documentation”/“code is self explanatory”
  • “Everyone” does the stuff that they are working on and there’s no need to share knowledge/docs
  • Many reimplementations of solutions to the same problem, rarely updated, and a lack of cross-collaboration.
  • Many projects became abandoned: they were born from a single dev pushing for OCaml in their company/project, then the dev left or the project got canceled, and no one maintains the project anymore.
  • Many great projects are created following academic principles and adapted to industrial usage.

And there is a long list of anecdotes showing that the culture doesn’t prioritize teaching the language to newbies. This is being pushed against by ocaml.org / Tarides / other big players (maybe not JS libs), who are working hard to change this narrative about OCaml.

The ppxlib manual, v3 documentation, opam registry are great examples of that.

7 Likes

I doubt that a lack of documentation is a technical problem. The points in the OP seem to be talking about inline comments in .ml files, which I personally wouldn’t even consider “documentation” most of the time. The kind of documentation I miss the most includes example code, “getting started” guides, or just general explanations of each library’s/function’s uses and limitations.

I have no theory on why the ecosystem has (historically) lacked that kind of documentation, other than that the culture just accepted it. Although, IMO the status quo has greatly improved recently.

4 Likes

I think everyone can agree that documentation visibility and discoverability have been an issue, but as @hyphenrf said, there have been big improvements there of late.

Beyond that, I guess I’d suggest pointing out places where documentation is poor? Leaving aside abandonware or obviously hobby projects, where is documentation lacking or of low quality? I can’t personally think of a clear example off the top of my head; and without getting concrete, generalized complaints are probably not very productive.

4 Likes

Ppxlib documentation used to be quite bad, but it got a big overhaul recently

1 Like

The entire stdlib documentation is devoid of usage examples. That’s one pretty glaring omission when comparing with other documentation sources I’d consider good.

7 Likes

This has been improving: https://github.com/ocaml/ocaml/pull/11476. PRs welcome! :)

Cheers,
Nicolas

6 Likes

Ah, one of those topics again (why is OCaml bad at X). There is certainly a feeling that the OCaml ecosystem is badly documented. Is it true? I’m not sure, since there is no clear metric, but this is certainly something that people love to spread. Still, there are aspects that could be improved or discussed. But first:

  • Lack of // style comments. I like those, so yes, but I don’t think they have anything to do with documentation in general. In Java, user-facing doc seems mostly written using javadoc syntax in /** */ comments (non-nested, mind you). Likewise for Python, the way to document modules, functions and classes is with docstrings that are, well, strings, so """ delimited.

  • Code is written by academics. We can find anecdotal evidence. If you take OCaml’s standard library or e.g. Menhir (both written by academics, at least initially for OCaml’s stdlib documentation), I don’t think they fall behind the documentation of libs or tools of other languages. On the contrary, you can find in the ecosystem libraries written by people from the industry that are completely lacking. And you can of course find the converse (industrial libraries that are well documented and academic libraries that are badly documented).

Now things that should be improved and that are being worked on (I think):

  • simplifying the workflow of doc generation to its utmost. A programmer should be able to see the final rendition of the doc of their .mli file without having to jump through any hoops. In that respect, dune build @doc is a tremendous help. One difficulty is technical, I think (and is being worked on in odoc). OCaml libraries tend to rely heavily on functors, includes, aliases and so on. From a programmer’s perspective, you want to put your comments where the code is (say in file a.mli). But if module A ends up being passed to a functor, itself included and re-exported in the user-facing API, you want the doc to appear there naturally (i.e. as if it were written there to begin with). I reckon other languages don’t have that level of difficulty to cope with.

  • Critical tools such as opam and dune seem to have a policy of “move fast and don’t break things” (which is good). Dune has the (lang dune <version>) stanza, which helps maintain dune files and evolve them gradually. For opam there are several announcements whenever breaking changes may occur (like for the deprecation of 1.2). The “move fast” part is a bit more problematic: it takes a bit of time to discover new features and for them to be documented properly. For instance, I don’t think the ocaml-option-foo way of installing variants of the OCaml compiler with opam is documented anywhere (at least I could not find it), except in posts on discuss or stackoverflow. For dune it seems that (include_subdirs qualified) still appears as “future work” in the latest dune documentation (while e.g. map_workspace_root is correctly referenced; both features were added in the recently released 3.7, it seems). We could also say the same for jsoo, where I don’t think the latest options are completely documented (I could be wrong). In all cases the problems are known, I think. The size of the dev team is relatively small, and writing documentation for tools is really different from writing docstring-style comments. It’s a project in its own right where you cannot just follow the shape of the code and annotate where needed (you can do that to some extent for command line options, which gives you man pages automatically, but that’s far from sufficient).
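
To make the functor/include point above concrete, here is a minimal sketch. All names (A_SIG, A, Make, Api) are made up for illustration and come from no real library. The doc comment on to_string is written once, next to the signature, but the user only ever sees it through Api, after a functor application and an include:

```ocaml
(* Hypothetical names throughout: A_SIG, A, Make and Api are illustrations only. *)

(* Imagine this signature lives in a.mli, next to the implementation. *)
module type A_SIG = sig
  type t

  (** [to_string x] renders [x] for display. *)
  val to_string : t -> string
end

module A : A_SIG with type t = int = struct
  type t = int

  let to_string = string_of_int
end

(* A functor that extends any module matching A_SIG. *)
module Make (X : A_SIG) = struct
  include X

  (** [show_all xs] renders every element, separated by commas. *)
  let show_all xs = String.concat ", " (List.map to_string xs)
end

(* The user-facing module: [to_string] is reached through a functor
   application and an include, yet its doc comment was written on A_SIG. *)
module Api = struct
  include Make (A)
end

let () = print_endline (Api.show_all [ 1; 2; 3 ])
```

The doc generator has to carry the comment from A_SIG all the way to Api.to_string for the rendered page to read as if it had been written there.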

1 Like

I guess these answers made me think of a few more:

  • the culture which is really “what works/ed for me should work for everyone” (as clearly some of the replies here are in complete denial)
  • the fact that documentation is often split from the implementation, as it’s in a header/interface file (.mli)
  • the lack of easy doc generation (dune build @doc doesn’t work on my project and I could not pinpoint why, and even when it works it’s pretty ugly :D)

I think the culture is easier to fix if more people are vocal about it. But the fact that code is split into two files (ml/mli) does add a lot of friction/overhead already and might be a big contributor as well (and I guess we’re not moving from that)

On the contrary, I think the mli situation makes documenting easier. You create an mli automatically using Alt + O in vscode, and then you just have to write the doc, with immediate feedback, as the mli file already looks a bit like the doc.
Otherwise, I believe dune build @doc should work most of the time. You should probably explain your issue here or on discord so that people can help you fix it.

6 Likes

That’s a fair suggestion, though I wonder if you’re completely discounting the OCaml manual, which is rife with examples? Obviously there aren’t chapters dedicated to each stdlib module, etc., but it is a great resource.

IME, granular library- (or module- in this case) level examples really don’t belong in an API reference, but in a separate repository (as the former is tautologically a reference, while the latter would fall under some tutorial/learning umbrella). A lot of the content @ Learn OCaml falls under that category. Of course there can/should always be more.

1 Like

I really don’t think that’s fair. I don’t see anyone saying “Everything available is sufficient for me, so there’s no problem”. If anything, there is recognition of there being gaps, but also a lot of questions about exactly what you’re referring to when making as broad-brush an assertion as “the OCaml ecosystem is poorly documented” – when, in fact, many parts of it are laboriously documented.

I think everyone appreciates experience reports, and it sounds like you might have some stories to tell, e.g. “I had trouble with this library/tool/whatever, because A, B, and C were undocumented”. Short of rolling up sleeves and filling whatever holes in documentation yourself, that’s probably the best way to make a substantive, actionable critique.

1 Like

No, I mean even more fine-grained examples, for each and every function. To augment the written description with a form that many people find much easier to understand. That’s something I think belongs in reference documentation. See for example the API documentation for Rust, Elm, Elixir, clojure, Kotlin, Go, JavaScript, and even .NET.

Tutorials and such showing bigger examples are great too, of course, but they serve a very different purpose, as you rightly point out.
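
As a rough sketch of what such a per-function example could look like in OCaml, assuming the odoc/ocamldoc {[ ]} code-block markup: chunks is a made-up helper, not a stdlib function, and the comment layout is only one possible convention.

```ocaml
(* Hypothetical helper, used only to illustrate a doc comment with an example. *)

(** [chunks n xs] splits [xs] into consecutive sublists of length [n];
    the last sublist may be shorter.

    Example:
    {[
      chunks 2 [ 1; 2; 3; 4; 5 ] = [ [ 1; 2 ]; [ 3; 4 ]; [ 5 ] ]
    ]}

    @raise Invalid_argument if [n <= 0]. *)
let chunks n xs =
  if n <= 0 then invalid_arg "chunks";
  (* [acc] holds finished chunks (reversed), [cur] the chunk being built
     (reversed), and [k] the number of elements still missing from [cur]. *)
  let rec go acc cur k = function
    | [] -> List.rev (if cur = [] then acc else (List.rev cur :: acc))
    | x :: rest ->
        if k = 1 then go (List.rev (x :: cur) :: acc) [] n rest
        else go acc (x :: cur) (k - 1) rest
  in
  go [] [] n xs
```

The written description and the example say the same thing twice, which is the point: many readers will skip straight to the example.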

5 Likes

Do you know about the whole ocamldoc syntax? You give 0 details about what you are unhappy about, but I’m assuming from your messages that you include code in your comments. If you use the right ocamldoc syntax then it won’t get broken by ocamlformat. Have a look at [ ] or {[ ]} or {v v}.

More details in the OCaml manual: “The documentation generator (ocamldoc)”.
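
For illustration, a small sketch of the three markup forms mentioned above inside one doc comment. clamp is a made-up function used only to carry the comment:

```ocaml
(* Hypothetical function, used only to show the markup. *)

(** [clamp ~lo ~hi x] returns [x] limited to the interval from [lo] to [hi].

    A multi-line code block:
    {[
      clamp ~lo:0 ~hi:100 150 = 100
    ]}

    Verbatim text, kept exactly as written:
    {v
      lo        hi
      |---------|
           x
    v} *)
let clamp ~lo ~hi x = max lo (min hi x)

let () = Printf.printf "%d\n" (clamp ~lo:0 ~hi:100 150)
```

Square brackets mark inline code, {[ ]} marks a code block, and {v v} marks verbatim text.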

“In OCaml” or “the ocaml community” doesn’t mean anything. Could you actually pinpoint things you are missing? It helps to prioritize improvements accordingly.

Could you open an issue on the odoc repository with details of what is “ugly”? There are already some unofficial CSS themes used in some projects. There’s also a style used on ocaml.org for the new documentation pages, for example for base: base v0.16.3 (latest) · OCaml Package. But if you leave the guessing to people who you think don’t do a good job at providing what you need, how do you expect things to get better?

There has been a large effort to improve documentation, tooling, accessibility, … for a few years, so there are clearly some people who are willing to make things better. But precise feedback helps, while vague ranting doesn’t do much.

5 Likes

Sorry for the double post.

You might have a try with odig as an alternative to dune build @doc. I very seldom use it. But maybe it works for you. There’s even a way to switch to a different style.

https://erratique.ch/software/odig/doc/manual.html#odoc_themes

2 Likes

Just as a bit of a point of order: your Clojure link is not to the actual API reference, but to clojuredocs.org, a community-maintained repository of additional documentation layered on top of the official documentation. (I mention this only because I had some hand in helping clojuredocs get started many years ago, where it was a community reaction to the core dev team explicitly not wanting to have the sort of expanded sets of examples and cross-referencing that you and others are looking for in OCaml.)

I’m not sure if you meant to imply that per-function examples are universal among widely-used languages, but it’s worth noting that that’s not the case, viz. Java, Python, Kotlin, Clojure, etc. I point this out only to indicate that, even in languages that seem to attempt to have granular examples, it’s very much a best-effort priority, rather than a strict necessity.

I do see the value in having such examples; having them available for OCaml libraries would I’m sure be helpful. The big questions are, who’s going to write and maintain those examples, and what does the infrastructure look like for maintaining them (i.e. having them in a testable form so their validity can be checked and expected output can be produced automatically).
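
To give an idea of what “testable form” could mean at its simplest, here is a sketch under my own assumptions: each documented example is kept as a named boolean check that can be run in CI. The layout below is invented for illustration and is not the format of any existing tool.

```ocaml
(* Each documented example is stored as a (description, check) pair.
   This layout is an assumption, not an existing tool's format. *)
let examples =
  [
    ("List.rev", fun () -> List.rev [ 1; 2; 3 ] = [ 3; 2; 1 ]);
    ("String.concat", fun () -> String.concat "-" [ "a"; "b" ] = "a-b");
    ( "List.filter",
      fun () -> List.filter (fun x -> x mod 2 = 0) [ 1; 2; 3; 4 ] = [ 2; 4 ] );
  ]

(* Run every example and collect the descriptions of the ones that fail. *)
let failures =
  List.filter_map
    (fun (name, check) -> if check () then None else Some name)
    examples

let () =
  match failures with
  | [] -> print_endline "all examples hold"
  | names -> List.iter (Printf.printf "stale example: %s\n") names
```

A real setup would also need to extract the examples from the doc comments themselves, which is where most of the infrastructure work lies.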

3 Likes

Just to make that a little bit more precise.

odig is not an alternative to dune build @doc, it generates documentation for the opam packages you have installed in your opam switch (or in a package system install).

It won’t generate docs for your project – unless it’s an opam package and you install it in your opam switch (e.g. by pinning).

But yes as @Khady mentioned it does have a theming system so if you are unhappy with odig’s stylesheets or odoc’s one, you can devise your own and even distribute it as an opam package for everyone to use. Here’s how odig discovers stylesheets.

(Aside, the theming stuff should rather belong to odoc itself, see this issue).

1 Like

Not at all, I was just pointing to “good” examples. Other languages that do not have these kinds of examples include Haskell, Scala, and I’m sure most languages that I did not list.

Absolutely. This was mostly just addressing your previous question, that it was difficult to identify points of improvement. Then to actually do something about it is of course much harder, and in this case in particular requires not just writing it once, but building infrastructure to maintain and ensure it stays correct. With such a scope, it seems to me unlikely that it will happen by way of one or more drive-by PRs, and would benefit from centralized coordination.

3 Likes