What are some libraries you almost always use?

Since @ostera gave his interpretation of this here, I’ll make that statement more precise, as I find myself quite in disagreement with what he wrote, which assesses the situation through what I find to be the wrong lenses (or rules).

I won’t comment on the docs aspect since I don’t find cmdliner’s docs to be particularly good – nor particularly bad either but I would certainly write them differently now that the feature set of the library has changed (more on this below).

However I’d like to address two impressions that come out of that article before they start to become myths. These are:

  1. The API is hard to use.
  2. You need to understand what an applicative is to be able to use the library and that’s a high bar for usage.

Regarding 1. I don’t think the API is hard to use. As far as I’m concerned, a hard-to-use API is one that allows you to easily shoot yourself in the foot, that makes it hard to define, tweak or evolve your command lines, or that makes it hard to understand what is going on when you come back to the code or when you have a bug. I personally find none of that to be true.

Regarding 2. You absolutely do not need to understand what an applicative is to be able to use the library. Over time I even removed that fact from the documentation and renamed Term.pure to Term.const. I think that any working OCaml programmer should be able to start from the short basics and gradually tweak that example to get to what they need, without ever having to understand what an applicative is, by simply following the type mechanics.
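To make "following the type mechanics" concrete, here is a toy sketch using only the standard library. It is not cmdliner's implementation: the `term` type, the `opt` lookup helper and the `greeting` function are hypothetical, and only the names `const` and `$` are chosen to mirror `Term.const` and `Term.($)`. The point is that each `$` consumes one argument of the function, and the type checker tells you when you are done.

```ocaml
(* Toy model (an assumption, not cmdliner): a term is a function from
   the raw arguments to a value. *)
type 'a term = string list -> 'a

(* [const v] ignores the command line and always yields [v]. *)
let const (v : 'a) : 'a term = fun _ -> v

(* [f $ x] applies the function produced by [f] to the value produced by [x]. *)
let ( $ ) (f : ('a -> 'b) term) (x : 'a term) : 'b term =
  fun argv -> (f argv) (x argv)

(* Hypothetical helper: look up "KEY VALUE" in the arguments, falling back
   to [default] when the key is absent or the value does not parse. *)
let opt (key : string) (parse : string -> 'a option) (default : 'a) : 'a term =
  fun argv ->
    let rec find = function
      | k :: v :: _ when k = key -> Option.value (parse v) ~default
      | _ :: rest -> find rest
      | [] -> default
    in
    find argv

(* The program logic knows nothing about the command line. *)
let greeting name count =
  String.concat " " (List.init count (fun _ -> "hello " ^ name))

let name_t : string term = opt "--name" Option.some "world"
let count_t : int term = opt "--count" int_of_string_opt 1

(* const greeting            : (string -> int -> string) term
   const greeting $ name_t   : (int -> string) term
   ... $ count_t             : string term *)
let greeting_t : string term = const greeting $ name_t $ count_t

let () = print_endline (greeting_t ["--name"; "OCaml"; "--count"; "2"])
(* prints "hello OCaml hello OCaml" *)
```

If an argument term is missing or has the wrong type, the annotation on `greeting_t` fails to type check, which is the feedback loop one relies on when tweaking an example.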

So if it’s not 1., what is it? One of the reasons why a lot of cmdliner code gets cut and pasted to be modified is that the API induces a lot of boilerplate and that it became slightly messier over the years. Here are a few reasons why this is the case:

  1. It started simply as a cli parsing library. However over time, it gradually evolved towards an “os process” interface library. Environment variable lookup was integrated as well as a formalization of program exits and their documentation. This was bolted on top of the API without breaking it. This means that the current API is unlikely to be the best way of structuring and exposing the feature set. That of course hampers its usability.
  2. It failed to capture one important pattern that became widespread as command line tools grew in complexity over the last decade, which is to have specific command line syntaxes for tool object verb, not just tool cmd. Many people, myself included, ended up manually encoding this pattern in an unsatisfactory manner too many times. (This PR is meant to fix that, but that may not help with streamlining the API).
  3. The Arg.t to Term.t mechanics could likely be streamlined by using fewer applications (in fact the new design I have in my head eschews it entirely).
  4. A few defaults that became clearer as more and more cmdliner programs have been written could likely be changed and/or integrated to cut down on some of the boilerplate.
  5. The library was written 10 years ago, when OCaml and its stdlib were different. Newcomers often evaluate program sources and designs with respect to the current state of the art rather than in the light of the era in which they were created; that leads to misunderstandings. For example there was no result type in the stdlib. Its integration both at the Arg.conv and Term level was added later, perhaps hastily, which muddled the API; nowadays it would certainly play a central role in the design, both for parsing and managing exits. Another example is the & operator, which confuses a lot of people but would nowadays simply not exist since @@ does, or not be needed at all (see 3.).
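As a small illustration of the last point, the sketch below shows why `&` and `@@` are interchangeable in this role. It assumes that the locally defined `( & )` mirrors the documented behaviour of cmdliner's `Arg.( & )`, namely plain right-associative function application. The `value`, `opt` and `info` functions here are hypothetical stand-ins that just build strings; they are not the real `Arg` combinators.

```ocaml
(* Assumption: same behaviour as the documented Arg.( & ), i.e. [f & x] is
   [f x], right associative. *)
let ( & ) f x = f x

(* Hypothetical stand-ins for the Arg combinators, modelled as string
   builders so that the nesting is visible in the result. *)
let info names = "info[" ^ String.concat "," names ^ "]"
let opt conv default i = Printf.sprintf "opt(%s,%s,%s)" conv default i
let value a = "value(" ^ a ^ ")"

(* Both lines parse as: value (opt "int" "0" (info ["c"; "count"])) *)
let with_amp = value & opt "int" "0" & info ["c"; "count"]
let with_at = value @@ opt "int" "0" @@ info ["c"; "count"]

let () = assert (with_amp = with_at)
```

Since `@@` only entered the standard library after the library was designed, a dedicated operator was the way to avoid nested parentheses at the time.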

Finally, to come back to the use of an applicative, which in the particular case of cmdliner was discovered, not applied. I have seen many people over time claiming that this was non-obvious, needlessly bureaucratic for “just parsing an array of string”, or that what they wanted was just “simple” and direct Arg-like mutations to get their parse result (these people should be forced to go work on the ocaml drivers…). These thoughts largely miss the point in my opinion. I think the current design is a good one because:

  1. It scales. In a term you can encapsulate non-trivial cli interaction that eventually defines a whole immutable data structure.
  2. It is composable. Your libraries can expose cli interaction terms to be reused.
  3. It precisely avoids the need for reference cells or mutations. These become tempting global mutable state or enable easy-to-do but hard-to-understand effectful contortions, so it’s better to avoid them.

These points encourage you to have a good program structure where you define data structures and algorithms, and, cleanly separated from them, the cli interface and its logic to expose them to the shell.
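The contrast can be sketched with the standard library alone. The first half uses the stdlib `Arg` module with reference cells; the second half is a hand-written pure parser that returns an immutable record. The pure parser is only a stand-in for what a term gives you and is not how cmdliner works; the `config` type, the `-v` and `-j` flags and both function names are hypothetical.

```ocaml
(* Style 1: stdlib Arg, parse results live in global reference cells. *)
let verbose = ref false
let jobs = ref 1

let spec = [
  "-v", Arg.Set verbose, " Be verbose";
  "-j", Arg.Set_int jobs, "N Number of jobs";
]

let parse_mutating (argv : string array) : unit =
  (* A fresh [current] is needed, otherwise Arg's own global index is reused. *)
  Arg.parse_argv ~current:(ref 0) argv spec (fun _ -> ()) "usage: tool [-v] [-j N]"

(* Style 2: the parse result is an immutable value, errors are a result. *)
type config = { verbose : bool; jobs : int }

let default = { verbose = false; jobs = 1 }

let rec parse_pure (conf : config) : string list -> (config, string) result =
  function
  | [] -> Ok conf
  | "-v" :: rest -> parse_pure { conf with verbose = true } rest
  | "-j" :: n :: rest ->
    (match int_of_string_opt n with
     | Some jobs -> parse_pure { conf with jobs } rest
     | None -> Error ("invalid job count: " ^ n))
  | arg :: _ -> Error ("unknown argument: " ^ arg)

let () =
  (* With references, state from an earlier parse leaks into the next one. *)
  parse_mutating [| "tool"; "-v" |];
  parse_mutating [| "tool" |];
  assert !verbose;
  (* With values, each parse starts from the configuration you hand it. *)
  assert (parse_pure default ["-v"] = Ok { verbose = true; jobs = 1 });
  assert (parse_pure default [] = Ok default)
```

In the second style the rest of the program receives a `config` value and never needs to know that a command line exists, which is the separation described above.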
