OCaml Documentation Open Thread

Re. DocJam: I’m thinking of aiming for the weekend of May 18-19-20 (I’m including a Friday in case some people are more available during the week), because it happens to be convenient for me and there is not much point in doing a poll at this scale. I’m leaving myself some time to think it over and for people to complain, but otherwise I can open a thread here and announce it on the caml-list and reddit.

What we need to get running (besides an agreement on the dates) is:

  • a document that describes the purpose of the event, the process, and how to join
  • a way to broadcast suggestions of libraries to look at; (if the organization document is on a wiki, we could have another page for these suggestions)
  • a communication channel (I’m thinking of staying with Discuss, but maybe also an #ocaml-docjam IRC channel with a link to the Matrix bridge)
  • some agreed way to aggregate/collect a list of things that have been done during the jam, to write a summary post/description after the fact
  • people willing to participate

One thing that must be articulated carefully is the question of how to review the work produced by the jam. People may be uncertain about whether their documentation or example proposal is actually “the right way” to do it, and we need to liberate ourselves from this worry by addressing it upfront and having a process in place. The simplest thing to try would be to encourage people to submit an upstream PR and share the link on the Jam thread, to encourage others to review it (before, or in addition to, the actual maintainer(s) of the library). But maybe some people will want a way to pre-screen their submission proposals – maybe they could create a PR from their doc-branch to their own trunk-branch for fellow jammers to review, and only send upstream after that is done.

8 Likes

BTW, while I’m not so much a fan of Rust documentation, they have plugged in a way to have code examples in the documentation be compiled when generating the doc (IIUC), which provides a sort of regression test. Besides, these code blocks are also equipped with a hyperlink to the online Rust playground. I haven’t checked recently, but I don’t think these features are currently possible in OCaml documentation, are they? As OCaml is also an interpreted language, and as jbuilder utop exists, it may even be easier than in Rust to implement something like this?
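To make the idea concrete, here is a rough sketch that is not tied to any existing tool. The example lives in the doc comment inside a {[ ]} block, and the assertion underneath is written by hand as a stand-in for what a doc-test runner would generate from it. The function `sum` is made up purely for illustration.

```ocaml
(** [sum xs] adds up the elements of [xs].

    Example:
    {[
      sum [1; 2; 3] = 6
    ]} *)
let sum xs = List.fold_left ( + ) 0 xs

(* Hand-written stand-in for a doc-test: a tool would extract the
   example above and turn it into a check like this one, so the build
   fails as soon as the documented example stops being true. *)
let () = assert (sum [1; 2; 3] = 6)
```

If `sum` later changed behaviour, the assertion would fail at startup, which is the kind of regression test described above.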

3 Likes

Concerning the discussion about the tooling itself, I’d like to add some information that might not be common knowledge:

  • at Jane Street, we are currently experimenting with using elastic search to search the documentation generated by odoc.
    More concretely a --elastic-search flag was added to the odoc html command to generate a .json index that can directly be fed to elastic search. For the moment the output is very noisy because of all the aliases, includes, etc. that are present in the codebase, and people are working on adding a sensible ranking system. We are also wondering whether something from the rotor project could be reused to address this issue.
    All this is very experimental at this point but will be shared with the community if we end up with something usable.

  • to react to “rust has a way to have good examples in the documentation be compiled when generating the doc”: I think we’ll probably end up with something somewhat similar at some point. The idea has been floating around for a while now of having odoc understand toplevel expect test files, i.e. docstring would be parsed in the usual way, and all the code (and expect) blocks would be wrapped in {[ ]} and left untouched.
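As a very naive illustration of the first step such tooling needs (and not how odoc actually parses docstrings), the sketch below scans a docstring and pulls out the contents of each {[ ]} block so they could be handed to a toplevel or test runner. The names `find_from` and `code_blocks` are hypothetical, and nested or escaped delimiters are not handled.

```ocaml
(* Find the first occurrence of [sub] in [s] at or after index [from]. *)
let find_from s from sub =
  let n = String.length s and m = String.length sub in
  let rec go i =
    if i + m > n then None
    else if String.sub s i m = sub then Some i
    else go (i + 1)
  in
  go from

(* Collect the trimmed contents of every {[ ... ]} block, in order.
   An unterminated block is ignored. *)
let code_blocks doc =
  let rec go from acc =
    match find_from doc from "{[" with
    | None -> List.rev acc
    | Some start ->
      let body_start = start + 2 in
      (match find_from doc body_start "]}" with
       | None -> List.rev acc
       | Some stop ->
         let body = String.sub doc body_start (stop - body_start) in
         go (stop + 2) (String.trim body :: acc))
  in
  go 0 []

(* Usage: print each extracted block on its own line. *)
let () =
  code_blocks "Adds. {[ 1 + 2 ]} and {[ List.length [] ]}"
  |> List.iter print_endline
```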

8 Likes

I’ve wanted to take a whack at ocaml documentation generation for a while, and I’ve come up with docre. You can check out the example documentation site for Reprocessing to get a feel for what it does.

Things that y’all might find interesting:

  • you can have examples that run in-browser (compiled with bucklescript) & are editable too! The Getting Started page has a few of these to play with
  • also, all code example blocks are compiled & run as doctests by default (you can opt-out per-block) so they never get stale
  • out-of-the-box search w/o relying on an external server
  • you can write documentation pages that aren’t connected to a module (as a markdown file) and they are processed (the Getting Started page is one such)
  • inspired by Elm’s documentation, you have full control over the order & placement of items within a module’s documentation page. See how the main module for Reprocessing has the values split up into sections. (see the source)

I know ocamldoc & odoc are there & powerful, but I wanted to see what I could come up with, and I’m quite happy with the result.

17 Likes

Thanks for chiming in, Jared! The search and the runnable examples (the types on hover!) of docre are indeed amazing. One particular feature I would really love to be in odoc generated markup would be the navigational elements (page sections and modules), which I believe is currently being addressed (among other things) by @antron.

1 Like

@jaredly We have seen lots of people talking about doc and the tooling lately, and you chiming in is the perfect occasion to add my grain of salt.

There are (at least) two aspects to a good documentation generator in a language like OCaml:

  1. Make it nice, with lots of UX features such as inline search, re<->ocaml conversion, cool inline code block features (Your typing tooltips are super cool and the inline editing has been a feature wish of mine for a loong time, although with js_of_ocaml instead).
  2. Untangle the module spaghetti so that functors, includes and signatures work properly and are linked.

The two tasks are hard and people working on one tend to severely underestimate the other. Ocamldoc does a poor job at both. Odoc is laser-focused on solving 2., docre only handles 1.

This is the perfect ground for cooperation! ocamllabs/jst has been funding odoc for more than 4 years now. I’m a lowly postdoc, so I can’t really fund anything, but I swear, if you make an odoc-powered generator that has all the cool features from docre, including one-button reason<->ocaml conversion like in the new ocsigen website, I’m paying a drink to everyone involved.

13 Likes

This is indeed nice - especially the search functionality. :+1:

:smile: Similarly I am in between my contracts but I will chip in too.

Thank you so much for working on this! This output is really nice looking! I have some minor quibbles with the typography (multi-line headlines are a little too tight on the line spacing, and I suspect that the font weight is a little too light for the visually impaired) but this really nails the overall look and feel! (I haven’t looked at it yet on mobile; is there a distinct set of CSS for that?)

As Drup mentions, it’s going to be important to untangle the module spaghetti as well, but if there’s a way to merge what docre and odoc do, I think OCaml documentation is going to look a whole lot better!

There is a mix of issues brought up here, some social and some technical.

The technical issues have mostly to do with things like visual rendering, search, and proper handling of module-system level primitives in the construction of interfaces. All of this is actively being worked on, much of it as part of a rework of the odoc toolchain being worked on at OCaml Labs presently.

Beyond the technical, there are of course elbow-grease issues; parts of the documentation that simply need more work. The people working on Base are committed to improving this aspect of the library, and indeed hiring a technical writer has already helped there, and I expect will help more over time.

Finally, there are differences in goals. The hash-table examples are instructive. The Ruby docs really come from a very different point of view than the corresponding stdlib docs. Ruby assumes that the reader doesn’t necessarily know what a hash-table is, and spends time introducing the basic concepts. The stdlib docs assume the reader knows what a hash-table is, and just wants to get down to the business of understanding the individual operators and what they do with a minimum of fuss.

There’s clearly room for both kinds of documentation. My personal view is that some intermediate point is ideal. I think stdlib-style documentation (and I very much include the documentation for Base here) should remain relatively terse, though not as terse as it currently is. I also agree that it should become more example-driven. Indeed, I’d love to have ways of integrating running examples in the documentation, which is something I expect will come as we refresh the tooling.

But I don’t really want to go all the way to the amount of prose in the Ruby docs. Which is not to say that such prose has no place. For example, Real World OCaml’s chapter on maps and hashtables is a lot more expansive, spending time talking about where these data structures fit in to the ecosystem and about when you’d want to pick one or the other. My ideal is to keep the stdlib relatively terse to make it effective as a reference manual, but to have complementary sources, like RWO, that go into more detail on the bigger picture concepts.

(if you want an example of really exemplary documentation, take a look at Racket’s docs. they’re beautiful, readable, and well written; and at the same time, don’t make you feel overwhelmed by prose, which is what the Ruby docs do to me.)

It’s worth saying that even from that perspective, sometimes we all agree that a more detailed approach is called for. For example, take a look at the documentation for Incremental, which is complex enough that a detailed manual-like description is clearly necessary.

In any case, this is all stuff we hope to make progress on in the coming month, and contributions are welcome! PRs to fix documentation are one of the easiest ways to get involved in a project, and there are lots of places that could use improvement!

9 Likes

I’ll point out an interesting feature of the Ruby docs as they stand: they work for both kinds of users! They have the quality of both being good for the rank beginner and for the expert.

Once they’re used to the documentation, experts go straight to the description for the feature they are interested in, skipping the intro at the start of the module page. Experts also (given the way the explanations are structured) ignore the more verbose material very easily because it comes after the call signature and basic information they’re looking for.

Beginners can read the rest, getting examples and detail experts don’t need. Once you get into the rhythm of it, as an expert you don’t really notice the wordiness. However, as a beginner, you have enough to get a toehold. This is a neat feature.

Indeed, I’d say this is a feature of the documentation for several of the more popular languages out there, and although it isn’t the only reason they’re popular, it’s an important aid to getting beginners to adopt (and fall in love with) one.

Experts might get irritated by excess words, but beginners get stalled permanently by the lack of them. I’ve seen people give up on particular tools for want of a few sentences that make their use clearer to a beginner.

One of the goals in the long term is, I think, to make OCaml a much more popular language. To do that, one probably wants more explanation, not less.

Note that OCaml is harder for most beginners than a language like Python, so it is even more important to have good docs, not just good intro books. I would therefore suggest that, in the case of OCaml even more than Ruby, erring on the side of verbosity is a good thing, provided the detail is something an expert can skip easily and that the parts they want aren’t hidden away.

(Heck, if it really is judged to be too much of an eyesore, one could even tag paragraphs and examples with “beginner” and conceal them with JavaScript when one doesn’t want to have them cluttering the page, but still have them there for newcomers.)

3 Likes

I tend to think the Racket documentation is actually quite detailed, though it made a slightly different decision about how to manage the detail. It has a distinct “Guide” and “Reference”, and thus puts more of the beginner material into the Guide and less in the Reference, but that material is still there and easy to get to.

Let me note the Racket documentation is quite beautifully formatted. I consider its choices nearly ideal, including the use of serif fonts at a large size (and with short lines) for the main font which enhances readability over sans serif body type, a highly readable typewriter font for code, excellent choices in colors (and in colored regions to guide the eye), horizontal lines to break up sections, a good left hand column quick navigation and index column, and more. OCaml could do worse than to steal many of the aesthetic choices wholesale.

There are also a lot of examples, even in the reference guide. Pretty much every function has an example associated with it.

2 Likes

I’ve been contributing to odoc a bit lately, with the main goal of making it more contributable.

After some prodding the other day (thanks to @bluddy, and based on threads like this one, so thanks @perry), the plan is to write a useful README, contributing docs, etc., this weekend, and actually open up the codebase to the community :slight_smile:

@Drup is exactly right that odoc is very strong on handling the complexity of OCaml, and has a huge blindspot on output quality and overall usability — what docre is so good at :slight_smile:

In odoc, this can only be solved by good ideas from the whole community.

For anyone particularly interested, you are welcome to PM me. I’d be happy to give tours of the project and codebase based on what I’ve learned about it and done to it, explain its gaps, etc. Perhaps we can turn that into a talk at some point, or at least it will lead to posting some well-written issues :slight_smile:

8 Likes

This is wonderful! Thank you @antron!

By the way, any idea how hard it would be to merge the two sets of capabilities of odoc and docre?

Documentation is always a fun topic, here are my thoughts:

  • I always find documentation hard mostly because I’m not a technical author and as such it doesn’t come naturally. Maybe someone could post a guide to writing good documentation

  • It’s not clear in a library how the documentation should be structured: Function and module documentation is obvious but where does the overview and general examples documentation go? In the library.mli file? How do I make good use of the new MLD files? A guide to creating library documentation and interacting with the various tools would be great

  • Many languages have a standard for API documentation (even if it’s de facto) with basic boilerplate for each type (intro, function, module etc.) and a guide as to where to put these. I’d love to see something like this for OCaml as then I wouldn’t have to think about formatting or structure. We could also include some standard style around how parameters are referenced in function documentation etc.

  • OCaml documentation is not easily discoverable, as @dbuenzli noted about his own excellent documents, even when they are only one click away. Can we create an OPAM document web where all OPAM library documentation is available? Add a searchable module index that displays the modules found with the package summary, and finding a specific module document will be much easier.
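For the searchable module index in the last point, a minimal sketch of the underlying data structure might look like the following. It assumes each package simply declares a name, a summary, and the modules it exposes; the `package` record, the `pkg_a`/`pkg_b` entries, and the function names are all hypothetical and do not reflect how opam or odoc store metadata.

```ocaml
module StringMap = Map.Make (String)

(* Hypothetical package metadata, for illustration only. *)
type package = { name : string; summary : string; modules : string list }

(* Map each lower-cased module name to the packages that provide it. *)
let build_index packages =
  List.fold_left
    (fun idx pkg ->
      List.fold_left
        (fun idx m ->
          let key = String.lowercase_ascii m in
          let existing =
            Option.value (StringMap.find_opt key idx) ~default:[]
          in
          StringMap.add key (pkg :: existing) idx)
        idx pkg.modules)
    StringMap.empty packages

(* Case-insensitive lookup; results keep the order of the input list. *)
let lookup idx module_name =
  StringMap.find_opt (String.lowercase_ascii module_name) idx
  |> Option.value ~default:[]
  |> List.rev

(* Made-up sample data. *)
let packages =
  [ { name = "pkg_a"; summary = "JSON parsing"; modules = [ "Json"; "Json_lexer" ] };
    { name = "pkg_b"; summary = "HTTP client"; modules = [ "Http"; "Json" ] } ]

let index = build_index packages

(* Usage: show every package that provides a module named Json. *)
let () =
  lookup index "Json"
  |> List.iter (fun p -> Printf.printf "%s: %s\n" p.name p.summary)
```

A real index would also need ranking and handling of sub-modules, but even this flat mapping is enough to answer "which package provides module M?".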

The format and structure of the documentation is definitely improving but, as @antron notes, output quality could improve. Here are a few thoughts on this:

  • As @perry notes the Racket documentation is a great place to steal formatting ideas from. Also, make the pages mobile friendly, I suspect a lot of people read the docs on the train.

  • a quick way to switch versions of a given library when browsing the docs

  • I have a bit of a love/hate relationship with the function-index-on-the-left layout but it does make it easy to find a particular function and jump to it. This would also allow a large intro/examples section at the top of a module without getting in the way of the seasoned pro.

Finally, I love the idea of the DocJam

3 Likes

I will look and see if I can find one. That said, I’ve found for myself that there are two keys to this:

  1. Read a lot of good documentation to get a sense of what it is like. Surprisingly, a lot of programmers haven’t seen many examples, and this makes it harder to do your own because you don’t have enough of a sense of what your goal is.
  2. Write a lot. Like many other things (playing a musical instrument, writing programs, etc.), one key to being good at writing is to do it often enough. No one starts good, but the more you practice, the better you get.

So in summary, read a lot, and write a lot, and you’ll get better.

This would be really great. One complaint I didn’t make earlier, because it isn’t really the fault of software authors: when you go to Github and look at an OCaml library, the documentation is naturally in source form, but you want to browse the finished output to get a sense of whether you want to use the thing. If there was an OPAM document library online, it would be ever so much easier to have the formatted documentation online without needing to maintain a web site just for a small library.

There shouldn’t be any hard technical obstacles in the long term, but it would take a bunch of non-trivial work, the volume of which should not be underestimated :slight_smile:


This is the OCaml Labs plan for docs (cc @avsm), and it’s one of the “wider” goals of the odoc project. It should explain partly why some of the odoc output looks how it does (e.g. module names hidden behind the package index, though that might not be the right way to do it).

It’s a big part of the reason odoc needs an excellent cross-referencer, so we can link between types and values in all these libraries, even when the source or target does pretty complex stuff with modules.

This should help get the majority of the libraries documented online.

For bigger libraries that need some kind of finer control over their docs, we can maybe support them with either lots of flexibility for customization, or some kind of combination of predictable URLs+redirection. Any ideas about this are very welcome :slight_smile:

1 Like

A centralized documentation website is certainly useful (especially for online discussion).

However, I still wonder about all these people who complain they can’t find the documentation of the libraries they use: odig generates the documentation of any library that you install via opam and that respects its conventions (which any jbuilder or topkg+ocamlbuild package automatically satisfies). It also gives you direct access to the release notes (odig changes) and readme (odig readme) of the packages.

Granted, there’s no module index yet (though the next release of odig will allow you to odig doc M) as odoc doesn’t generate indexes yet. But again, if you know which packages you use, accessing their documentation is as simple as odig doc PKG.

A side note: for me, a problem is finding the docs for the libraries I don’t yet use. I often see something exists in opam and/or github but I don’t want to bother with it until I’ve verified it does what I need, but then I need to build it just to get to the docs. :frowning:

8 Likes

No one has mentioned it, but it might be worth also looking at Owl’s documentation, which is built using Sphinx from formatted text embedded in ocamldoc comments in Owl, also using source here.

Note that the look of the Owl documentation varies. In browsers on my Macs, there is an all-white background with an “Old Style” typeface–maybe a Garamond. In Safari on an iPad–this is probably the mobile device look–there are black page headers with what I think is what’s known as a “Transitional” font. The point is that you’ll notice the difference in the overall look on different devices, and if you don’t like one, you might like the other. Presumably this is all configurable.

@ryanrhymes would know more about how the Owl docs are set up.

I should add that the Owl docs are a work in progress. Many functions still have signature-only documentation, but the intention, I believe, is that all functions will eventually have descriptions. This is one area where I am contributing when I can. Also note that there’s a separate tutorial overview document in the same documentation tree to supplement terse function docs. This is one variant on the strategies discussed earlier in this thread. I think that the doc build process is slightly awkward at this point because you have to build from the Owl Guide tutorial source which then pulls in from the Owl source, so you can’t build the Owl docs without also downloading the Owl Guide source. It’s not a big deal, though, and I’m sure other ways of doing it would be possible.

3 Likes