On documentation, names, and the like

I see an extremely vehement argument here for a particular definition (out of many definitions, by the way) of equality. (It’s not a very good definition, but I’ll not pursue that since it’s irrelevant.)

What I don’t see here is any vehemence about making sure that documentation successfully conveys information to another human being.

Documentation need not be “philosophically correct” to be entirely successful at its central purpose, which is explaining something to the reader, and it can be “philosophically correct” and an utter failure at its central purpose of explaining things.

I see here a very large amount of concern for why a particular word is correct or wrong, but no concern whatsoever for what the best possible explanation for a reader would be, and how to deal with potential confusion.

By the way, the issue isn’t the word “is” or “returns”. It’s that all this documentation did was reiterate (as you yourself stated, with actual admiration!) the contents of the code. One can doubtless come up with some much less idiomatic phrasing that uses arbitrary politically correct terminology and yet explains the content — but I have seen no attention devoted here to the problem of clear explanation or any concern about the report that the explanation was puzzling to a reasonably experienced reader.

We are both puzzled :wink: What confused me is that a reasonably experienced reader would need to read the code after reading this documentation (they say the same thing in two distinct languages), and would need time to understand what it does. For a beginner I can understand, but for an experienced reader that’s more difficult. The first thing I learned in OCaml (around 20 years ago) is that a sentence such as “let y be the half of x” is translated to let y = half x.

I realize I made a terminological mistake in my previous comment: negate returns something (the negated version of the predicate passed as argument), but negate pred val returns nothing; it is not (pred val). Hence your first sentence, "negate takes a boolean-valued function and returns a negated version of the function.", is right, but not the second one.

By the way, the type of negate : ('a -> bool) -> ('a -> bool) says that it takes a boolean-valued function and returns another one, and the documentation [negate p v] is [not (p v)] says that this is the negated version (as I expect from its name).

So, I don’t see why the doc should be written like you want (the fact that you did not understand it is not a sufficient condition).

Could you explain what really puzzled you when you read the doc for the first time?

@perry, I think everyone here agrees that you found the documentation confusing, and that it is a problem to be fixed. The disagreement is mainly on the technical way to fix that confusion. That’s why it would be very interesting to know a bit more about why you found some documentation confusing (for instance, concerning the negate function). To use your analogies, currently, it is a bit akin to a restaurant where you don’t find the food good, and require that the chef change it to be good, but don’t provide more details about why you didn’t like the food (too spicy? not spicy enough? wrong consistency? etc…). So what exactly was confusing you in the original documentation?

You say you “couldn’t make heads or tails of [the original doc] for almost two minutes”. Is it because of the use of is (compared to returns)? Or because the style of documenting the semantics of the function as code perplexed you?

Now, about how to fix the documentation to make it less confusing. Your point seems to be that the documentation string for each individual value should carry more information (correct me if I’m wrong), whereas other ideas suggested here include teaching people the main concepts with which to read and understand documentation, and adding some more context information in module preambles, to cite a few.
I personally tend to agree with the original documentation, for reasons I’ll explain.

In documentation, I think it is important to be attentive to length, because readers have a limited attention span. Documentation that is too long can be as bad as too little documentation for many reasons: the reader might stop halfway and decide she/he will re-implement the needed functionality (particularly in the case of trivial functions such as negate), or the pertinent information can be hidden inside a lengthy text, which is what I think happens with your suggested documentation for the negate function.

In your suggested documentation you would have:

val negate : ('a -> bool) -> 'a -> bool
(** negate takes a boolean-valued function and returns a negated version of the function.
     Thus, negate pred val returns not (pred val). For example, this could be used to
     reverse the sense of a function passed to List.find or List.filter. **)

The first phrase is actually a repetition of the type of the function; it does not bring anything more than the type just above, which I believe might be a waste of the reader’s attention: the type is the second thing a reader sees when looking at documentation (first is the name, third is the docstring), so repeating it in the docstring is not very useful, except in the case of very complex types such as functions from the Format module. Here, the type is simple enough that it shouldn’t need an explanation. The reason it shouldn’t need an explanation is that there are a few prerequisites for reading documentation, namely understanding the language, because the documentation for a package can’t and shouldn’t be expected to explain the language. Concretely, this means that documentation can expect readers to be able to read and understand simple type expressions. Thus repeating the type in the first sentence of the docstring is superfluous. Maybe this is useful in other languages where types (or their absence) do not give as much information, but in OCaml, this seems like wasted attention span.

I also find the third phrase problematic. The negate function actually does not have a very particular intent (that is, it is quite generic and isn’t aimed at very particular situations in my opinion); it is mainly a convenience function that might be useful in a wide variety of cases. Thus, stating an intent for the function, while potentially useful for newcomers who might be in the particular case described in the doc, might also be confusing to people who then might ask: if the intent is as described, and I’m not in that case, is using this function correct? Basically, describing the intent by specifying a special case is useful in tutorials or teaching materials, but it might be too restrictive in documentation.

Lastly, for an experienced OCamler, your suggested documentation hides the pertinent part of the documentation inside two longer sentences, which might make the documentation confusing and less informative.

So basically, the point here is that, just as doc can be confusing for you, your suggested improvements might make it confusing to other people (a fact that can’t be dismissed if I read the conversation correctly, ^^).

Basically, I’m saying the same thing as @kantian in many more lines (that’s the downside of answering in parallel, ^^).

I’m not sure everyone agrees it is a problem to be fixed. If everyone did, then it would just be a matter of finding the best way to fix it. I think there are some people who believe it is not something that requires fixing. I may, of course, be mistaken. If you like, of course, you might try proposing your own alteration to our example, just as an exercise in trying to figure out how to fix a documentation problem.

In this particular case (and let’s not focus on it too much, there are many such cases), I found the content baldly correct but entirely uninformative. I couldn’t discern what the intent was of the creator of the function. Why did this thing exist? What was it for? The name was no help, “negate” could mean any one of dozens of things. Clearly this is a tool I might want to use, but under what circumstances should I remember that it exists so I might decide to use it? (It is useless for a library call to exist if no one ever remembers it.) Was this call meant to be used as a way of constructing a function that would be passed on, or was it expected that the function would be applied immediately (as it was in the confusing example given in the one sentence of documentation)?

Now, perhaps I’m very thick compared to many experienced OCamlers. Indeed, I would argue I probably am, and I mean that quite seriously. On the other hand, I also quite seriously believe I’m more representative of average working programmers than most people at the top of the OCaml community, who are unusually intelligent.

My own version of the documentation attempts to correct the specific issues I had. It might be inferior to yet other versions that could yet be constructed, especially ones with an explicit example (I just hint at what might be an example), but it feels at least a bit better. My version is intended to explain that the thing yields a function that can then be passed to other things (perhaps functions like filter), and that you would want it because if you have a function that returns the opposite of what you want, it will give you the correct one. (An example might make things even better, say showing that you could get the even numbers from a list given negate and an odd predicate function.) Note that you clearly want to pass the negated function if you’re reaching for this thing — if you just want to negate the output of odd once, not will do just fine.
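
The example hinted at in this post could look something like the following sketch. It assumes local definitions of `negate` and `odd`, written here purely for illustration; neither is taken from a specific library.

```ocaml
(* Hypothetical local definitions, for illustration only. *)
let negate (p : 'a -> bool) : 'a -> bool = fun v -> not (p v)
let odd n = n mod 2 <> 0

(* The negated predicate is built once and passed on to List.filter. *)
let evens = List.filter (negate odd) [1; 2; 3; 4; 5; 6]

(* For a single test, plain [not] is enough and negate is not needed. *)
let four_is_even = not (odd 4)
```

The last binding illustrates the other half of the point: reaching for `negate` only pays off when the negated function itself is what gets passed around.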

Is this necessarily the “best” documentation? No. Is it better? Probably. Might it be made better still, possibly without adding much or any length? Possibly. It might even be possible to make it shorter again while still correcting the problems in the original. I don’t claim my version is perfect or a paragon, just a bit better.

I doubt that three sentences instead of one will greatly inconvenience even the most experienced reader. Many working programmers use languages where documentation is much more verbose. Indeed, almost all working programmers do, and yet I have very rarely heard people bitterly complaining about something being several sentences too long. Instead, people almost always bitterly complain of wasting six hours because the documentation for something was several sentences too short.

Redundancy is important in human explanations. Note, though, that the type does not explain that the returned function is negated, and the easiest way of doing that is to reiterate the type in the explanation. (I suppose in Coq or Idris a dependent type alone could convey everything, but this is OCaml.)

To you, perhaps it is not useful. To me, it is quite useful. Unless you think it would actively harm the comprehension of some readers, I think a little redundancy absolutely improves comprehension. Recall that humans often fail to notice individual words, or are tired, and in any case are not perfect. Also recall that you are not everyone. Other people’s comprehension process might not work as yours does. In particular, mine does not. And again, I don’t claim to be brilliant — I claim the opposite in fact.

I find it quite necessary for understanding. If I cannot picture what something would be useful for, I cannot really understand it. Different people have different styles of understanding. Some are visual thinkers, some are auditory thinkers. Some people literally have no visual imagination (and if asked, literally cannot picture an object and rotate it in their mind), some are hard pressed to learn something unless you describe things visually, some need to hear the explanation in their heads. Some people work better with abstraction, some people require examples to understand, some (like me) need to abstract things from their own examples to form a category in their head.

Experienced educators attempt to explain the same concept in several different ways so that people who have different understanding styles will all be able to understand what is being conveyed. For myself, I can visually imagine quite abstract things (say, infinite dimensional Hilbert spaces — yes, I have a “visualization” of those), but only given a set of quite concrete examples to generalize from. (Indeed, oddly, I work best with abstractions, but can only form those abstractions myself starting with concrete examples of a class.) Others require other forms of assistance. There is no one right way to understand things, and one cannot tell someone their way is wrong if it works.

Again, it is not up to the author to say “but you should not be confused by this description”, the author must simply accept that they may not think about something the way a reader does. Redundant, somewhat different ways of conveying the same information generally work best, but the only way to learn for sure is to see how people react to what you wrote.

Careful selection of words (“can be used, for example” and the like) can obviate such problems for most readers. Perfection is of course not possible, but carefully paying attention to reports from readers can usually get a document to the point where almost everyone can understand it well, and to the point where it is generally useful to both newcomers and relatively experienced people.

Quick answer (I’ll answer precise points later), but I think one of the main points of disagreement we have is the intrinsic purpose of documentation, or rather in this case, of function docstrings.

If I’m reading correctly, you think the docstrings of a function should explain the function, whereas in my mind it should be a description of the function (its semantics). This can be seen on the negate example where you try and explain how, why and where to use the function, all in the docstring.

I think these explanations (about the intent, on how, when and where to use functions) are better suited for READMEs, tutorials, module introductions, teaching material, etc… rather than individual docstrings.

Maybe discussing that point would solve a lot of disagreements on documentation.


Documentation should indeed explain the software and APIs they discuss, yes. I believe otherwise the document would serve no purpose.

If the documentation doesn’t inform someone of anything they can’t learn by reading the code, why have documentation at all?

I can’t think of a language which has achieved widespread use whose primary documentation operates on the premise that it should be as parsimonious as possible, and not explain things like library functions but only state their semantics with as little added as can be achieved. I think there’s a good reason this is rare if anyone strives for it at all: if you actually followed such a rule strictly (which OCaml’s manual largely does not do either, it often explains things, examples on request) then you would never get any users for your language because, even if the language was truly wonderful, no one would ever find out because everyone would give up in advance. Language creators who follow such a strategy are lonely, with few adopters.

Even quite dry documents like the ISO C standard (which I’m quite familiar with) take pains to explain the things they are discussing and give examples, because otherwise no one would ever understand what was intended. (The ISO C standard is full of examples in fact, even though it is a standards document, because without them many important subtleties would be incomprehensible.) The ANSI Common Lisp standards document, another example that comes to mind from long familiarity, is also rich and explanatory even though it is a specification.

If a standards document, which is certainly not a language tutorial, can give explanations and examples and even quite careful rationales for design decisions, then so can anything.

I don’t think a valid goal of documentation is to be minimal. The goal is to be informative, not to win a contest for terseness. There’s a good reason for software to be minimal — among many other things, simplicity and correctness are valuable and make maintenance easier. Documentation is not, however, software. It is almost certainly true that documentation could be overly verbose, but this is such a rare phenomenon that I struggle to name instances in my memory that I’ve encountered. Almost always real users complain (and complain bitterly) that the documentation they need is not there or is incomprehensible, not that it is too long. I’m sure you can find an exception or two, but they’re not common.

I see no motivation for striving for minimality in software documentation. I’ve seen it claimed several times in this and related discussions that if documentation explains too much it becomes difficult for working programmers to use, but I struggle to recall an instance of a reference manual I couldn’t use because it explained things too much, and I’ve been programming since the 1970s and have probably used 30 or more languages over that period.

I will also note that, almost invariably, the most popular languages, even ones that are truly terrible, have excellent documentation. The documentation for Perl, for example, is extremely thorough and easy to read, and is beautifully clear. Perl itself is horrible in nearly every way, but I’m fairly convinced that the language became popular, where many better ones didn’t, because people could actually learn it by reading the documentation. I was forced by circumstances to use Perl for many years, and I cannot once recall thinking “if only this documentation said less and was shorter.”

Your supposition is unfounded: I was more confused by your proposition than by the original docstring. And you can’t negate that this is inconvenient to me, only because it is clearer to you. :wink:

By the way, with this very trivial example I do not really need a docstring: there are only four pure functions with the type ('a -> bool) -> ('a -> bool):

let identity : ('a -> bool) -> ('a -> bool) = fun p x -> p x

let always_true : ('a -> bool) -> ('a -> bool) = fun _ _ -> true

let always_false : ('a -> bool) -> ('a -> bool) = fun _ _ -> false

let negate : ('a -> bool) -> ('a -> bool) = fun p x -> not (p x)

hence, from the name I know which one is implemented, and this is confirmed by the docstring. With yours, the essential information is lost among needless ones (the repetition of what the type says) and with a verb that hides one of the principles at the heart of functional programming: equational reasoning.


This is what you’re saying, translated: “I immediately see the only possible implications, I don’t see why anyone else needs documentation, look at how simple this is.”

In other words, you are persisting in attempting to argue that the fact that someone is confused is their fault, and that the documentation is already perfect.

Not everyone is you.

At the point where someone is forced to consider what the four possible functions are, and then figure out what the intent of the use of that function might be, what you’re doing is sudoku or kenken, not reading the documentation for a module to familiarize yourself with the contents. Note that I figured it out myself pretty quickly, but it wasn’t something I should have had to figure out. Documentation is about explaining, not about setting a challenge to the reader. It shouldn’t have to be work, there’s enough work in our lives already.

Not everyone is in the mood to puzzle things out. They read the docs for the APIs they use so they’ll know what’s available and how to use them, and not for intellectual stimulation. They often read documentation under harried circumstances, like on the bus home from their office, or in between bites while eating lunch. They are not seeking a puzzle at that point, they are seeking to quickly scan through a bunch of manuals so that they can become better at using a tool they have to use to get their work done.

Sometimes, people are also looking through a manual for reference purposes. They’re reading someone else’s code, and they want to quickly understand what a function does, or they suspect there is a standard way of doing something and they want to find it quickly.

Many programmers also might not be particularly brilliant — many working programmers aren’t. I have taught many people who are not, in fact, particularly brilliant, and yet they too have a right to learn things and to function in their profession. The world requires millions of programmers to get its work done, and the bulk of them will not have PhDs or even be capable of getting them. They still want to get their jobs done, and indeed, it is in your interest that they be able to get them done, as you yourself depend on the product of their work. Bugs they introduce might one day overexpose you to radiation while you’re getting a CT scan, or might accidentally overdose you with drugs from an infusion pump, or might crash the plane you are traveling in. It is in your direct interest not to be overly contemptuous of people whose work impacts your life.

Some users of documentation might also not be at their best while reading. They might be tired, their two year old having been sick with the flu all night, or they might be low on blood sugar, or they themselves might be recovering from the flu but on a deadline, or they might otherwise not really be in the mood for enumerating the possibilities from the type signatures and then considering how those possible return functions might be applied.

Again, you argue that you would simply look at the types and have no need for such assistance, but again, not everyone is you. You are fortunate enough to be very smart, always alert, always in the best of health, and perpetually young, but not everyone is like that.

The world works best when we have sympathy for the plight of others rather than ignoring them. It is best when one considers that perhaps others are not as fortunate, or not as smart, or not as skilled, or not as healthy, or not as able bodied, etc. An architect who said, “I don’t see why someone needs a ramp to get into the building, I walk up the stairs just fine” isn’t a particularly kind or considerate person — and indeed, such an architect is also hurting himself.

However, I have a proposal. As you’ve said having the documentation that others might find informative hurts you, and you can get everything you need from looking at the types, there’s a very simple solution here: you can just not read the documentation paragraph, and just look at the types. Nothing forces you to look at the documents. The rest of us, who are not as smart, and who do not wish to enumerate the possible functions associated with a given type signature but who just want to get our work done, could then proceed to read something intended for more ordinary people.


I think you’ve misinterpreted my intention. I just wanted to tell you what state my mind is in just after reading a type signature like negate : ('a -> bool) -> ('a -> bool), and so why your supposition “that three sentences instead of one will [not] greatly inconvenience even the most experienced reader” is wrong, at least in cases similar to the one we discuss.

I do not need a long time to figure out what the possibilities are, it’s immediate, and sometimes (and even most of the time) there are many more possibilities than four. So, after the function signature, the next information I want is: which one it is, not a sentence that paraphrases what I already know (the function type).

You could also have noted that I’m not against adding examples if this could make things clearer for working programmers (I’m fairly aware that most programmers are not me). What I dislike and what confused me is that you totally removed the original docstring.

I’ll be fine with a docstring like this one:

val negate : ('a -> bool) -> ('a -> bool)
(** [negate p v] is [not (p v)].

Negate is the logical negation applied to predicates on some type `'a`.
For instance, if `even : int -> bool` tests if its parameter is even
then `negate even` tests if it's odd.

`negate even n = not (even n) = odd n`

You could use this function to negate a property passed to function
such as `List.find` or `List.filter`.

`List.filter even [1; 2; 3; 4] = [2; 4]`
`List.filter (negate even) [1; 2; 3; 4] = [1; 3]`
*)

@perry, I am not sure what you are trying to achieve with such adversarial argumentation. You should not be so quick to dismiss the point of view of others: I assure you that @kantian is totally sincere when he says that for him (and many people with a mathematical background) the semantics of negate is totally obvious from its type (after all there is only one element of order 2 in S₂).

Going back to the documentation for negate,

negate p v is not(p v)

is a sweet spot for a lot of people talking in this thread because it is both simple and exact.
Your extended documentation loses this simplicity, in particular by introducing a new external module.
It seems that you think that your documentation is obviously much better, but as this thread has shown it is not the case for many people. Rather than arguing on the fundamental nature of documentation, you should focus on finding a better compromise.

For instance, starting with your documentation

negate takes a boolean-valued function and returns a negated version of the function.
Thus, negate pred val returns not (pred val). For example, this could be used to
reverse the sense of a function passed to List.find or List.filter.

I have four remarks:

First, it starts by documenting the type of the function, not the function itself.
Second, val is a keyword.
Third, returns implies that there is a heavy computation, potentially with some side effect.
Fourth, the reader should not have to know List to understand the examples.

Thus, I would rather start with a description of the function:

negate predicate v is not(predicate v)

The description is certainly a bit dry; we could describe the function more precisely with

negate predicate is the predicate that is true if and only if the original predicate is false

Maybe some simple examples are warranted

For instance, negate is_odd is is_even

Combining all the pieces together, maybe

negate predicate is the predicate that is true if and only if the original predicate is false: for any value v, negate predicate v is not(predicate v).
For instance, negate is_odd is is_even.

could work.

I am sure that this documentation is quite imperfect. The use of if and only if is, for instance, quite problematic. I think I like @kantian’s proposition better, for instance. @perry, would you be so kind as to comment on how you would improve it without lecturing me?


It was not intended to be adversarial. Initially, I thought I was stating a point which is retroactively obvious but not necessarily prospectively obvious, which is that the test for whether documentation is good is not whether you (likely as the author of a piece of code) find it sufficient, or whether an expert finds it reasonable, but whether a naive and possibly much less skilled individual can understand the content well enough to use the thing being documented.

What I was trying to achieve is simple: I was trying to get people to start thinking in terms of “how do I make this documentation clear and understandable for a wide variety of users who might be coming at it from a variety of backgrounds and skill levels.” My hope was that this would convince people to be more careful about how they wrote things and to take bug reports from users into account when writing.

However, there seems to be violent antipathy to this idea — antipathy that frankly shocked me. I was sufficiently surprised that, rather than simply walking away, I responded more vigorously than I probably should have. It would have been better for me to stop responding, but I was temporarily more interested in confirming what I was hearing and in attempting to restate my position (which seemed so obvious to me as to be difficult to argue against) that rather than do that I engaged more than I should have.

Regardless, as it appears to me that there is a general distaste for my position, I think I’ll be backing off on my attempts to convince people to improve the documentation, at least for now.

I don’t want to get into the ins and outs of this discussion, but one small suggestion I would make for the specific example being discussed is to change:

negate p v is not(p v)

to something like:

negate p is (fun v -> not (p v))

especially since the type in the mli has been written:

val negate : ('a -> bool) -> ('a -> bool)

For sure I am sincere. The only way to define a function with this type is to compose the argument with a function from bool to bool, and there are only four such functions: the identity, the two constant ones and the negation not. If we denote the composition operator by %, then negate p = not % p.
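
As a rough sketch of that argument: the `%` operator below is defined locally (the standard library does not provide it), and `identity_b`, `const_true`, `const_false` and `is_even` are hypothetical names introduced only for this illustration.

```ocaml
(* Composition operator, defined here for illustration. *)
let ( % ) f g = fun x -> f (g x)

(* Three of the four total functions from bool to bool;
   the fourth is the standard [not]. *)
let identity_b (b : bool) = b
let const_true (_ : bool) = true
let const_false (_ : bool) = false

(* negate is the argument composed with [not]. *)
let negate p = not % p

(* Hypothetical predicate used to exercise the definitions. *)
let is_even n = n mod 2 = 0
let is_odd = negate is_even
```

Composing a predicate with any of the other three functions gives the predicate itself or a constant predicate, which is why the name alone is enough to pick out `not % p` for some readers.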

What is amusing is that the way I reason: start from the consequence (what you want) and find the only conditions under which it is possible, i.e. the analytic or regressive method, is precisely the one that Kant used in the Foundations of the Metaphysics of Morals. But I don’t think it’s the place to discuss the kantian solution of the well known chicken or the egg dilemma. :stuck_out_tongue:

I think you misinterpret the different reactions in this thread. There was no antipathy to this idea, but as @zozozo told you:

We all agreed that something should be done; we did not necessarily agree on what should be done.


Concerning the specific example being discussed:

  • Base (JST) has:
(** Negates a function. *)

val non : ('a -> bool) -> 'a -> bool

Rather succinct!
  • Batteries has (for years!):
File: /src/batPervasives.mliv

val neg : ('a -> bool) -> 'a -> bool
(** [neg p] returns a new predicate that is the negation of the given
    predicate.  That is, the new predicate returns [false] when the
    input predicate returns [true] and vice versa.  This is for
    predicates with one argument.

    [neg p] is [fun x -> not (p x)]
*)
Not bad...

Quick note:
negate, non, neg.
One function, three names.
Another practice that complicates things.


I suspect the issue here is that we have two (or maybe three) distinct groups: the old timers who just read the mli and understand, and the new guys who need more help and examples. The third group lives in the middle, where the bare mli needs a bit more documentation.

Personally, I have always found good library documentation a great way to learn a new language, especially the core libraries.

As a relative newcomer to ocaml, isn’t the problem with the suggested:

in fact that negate is a function which takes two arguments, a boolean valued function and a value, and returns a boolean value. The point is that in ocaml multi-argument functions are curried, and in normal use negate would be partially applied and as partially applied it might be passed to List.find or List.filter.

So far as any elucidation is required, couldn’t it just say “in normal use this function is intended to be partially applied”? But perhaps expressing the signature as:

val negate : ('a -> bool) -> ('a -> bool)

in place of:

val negate : ('a -> bool) -> 'a -> bool

already makes that obvious to a more experienced ocaml programmer reading the documentation.
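
A small sketch of this point, using hypothetical module names: because `->` associates to the right, both ways of writing the signature denote the same type, so a single two-argument definition satisfies both and can be partially applied.

```ocaml
(* Signature written with explicit parentheses around the result. *)
module type WITH_PARENS = sig
  val negate : ('a -> bool) -> ('a -> bool)
end

(* The same signature written without them. *)
module type WITHOUT_PARENS = sig
  val negate : ('a -> bool) -> 'a -> bool
end

(* One curried, two-argument definition. *)
module Impl = struct
  let negate p v = not (p v)
end

(* The same implementation is accepted under both signatures. *)
module A : WITH_PARENS = Impl
module B : WITHOUT_PARENS = Impl

(* Hypothetical predicate, for illustration. *)
let is_empty s = String.length s = 0

(* Partial application: the resulting predicate is passed to List.filter. *)
let non_empty = List.filter (A.negate is_empty) [""; "a"; ""; "bc"]

(* Full application: both arguments supplied at once. *)
let direct = B.negate is_empty "x"
```

The parentheses change nothing for the type checker; they only hint to the reader that partial application is the expected use.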


I have no joke here. I just like saying…

type 'a predicate = 'a -> bool
val negate: 'a predicate -> 'a predicate
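
For what it is worth, a minimal sketch of how that alias could sit in a module. The module name `Pred` and the predicate `is_blank` are assumptions made up for this example.

```ocaml
(* The alias is transparent: 'a predicate and 'a -> bool are the same type. *)
type 'a predicate = 'a -> bool

module Pred : sig
  val negate : 'a predicate -> 'a predicate
end = struct
  let negate p v = not (p v)
end

(* Hypothetical predicate, for illustration. *)
let is_blank : string predicate = fun s -> String.trim s = ""

(* A negated predicate is still an ordinary function for List.filter. *)
let kept = List.filter (Pred.negate is_blank) ["a"; "  "; "b"]
```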