New lesson on polymorphic variants

Hello,

I wrote a new lesson on polymorphic variants.

You can find it here.

I appreciate any feedback you may have. I expect that if there is interest in including a lesson on polymorphic variants that there will likely be several rounds of refinement.

Sources:

This lesson is about 800 lines long (about 1100 with line length limited to 85 columns). This makes this lesson on the longer side when compared to other lessons on OCaml.org. Therefore, this is the first of 2 lessons on polymorphic variants.

This lesson (lesson 1) introduces the concepts behind polymorphic variants (such as row polymorphism and structural typing), then discusses common syntactic structures of polymorphic variants. It teaches these concepts in a bottom-up direction. It is my subjective belief (held lightly) that introducing polymorphic variants in a top-down direction leads to more complexity and confusion.

Lesson 2, which is forthcoming, introduces common usecases for polymorphic variants through real-world examples.

Any feedback a reviewer is willing to provide is greatly appreciated. The author is particularly interested in ensuring accuracy and validity of examples and consistency in the language with OCaml.org’s other materials, but all feedback is welcome.

5 Likes

The use of boldface in “An Initial Mental Model” seems very arbitrary to me, and I don’t understand why some phrases are bolded. In the quotes sections below I would expect nothing to be bold, or only “Ordinary variants” and “Polymorphic variants”. I was unable to find other content on OCaml.org which uses font styling like this.

  • Ordinary variants, like files in folders, exist in a tree-like structure. They are local to a module or type definition and are accessed via module paths. This structure makes them predictable and type-safe, and the compiler helps us track and enforce their usage precisely.

Calling ordinary variants type-safe and then declining to describe polymorphic variants the same way seems to imply to me that they are not. I also don’t understand how accessing something via a module path makes it type-safe. I also don’t quite understand what the tree-like structure mentioned is. Is it just modules? If so, I am not sure how that is inherent to ordinary variants.

  • Polymorphic variants, like labels, are part of a graph-like system. They are not tied to a particular module scope and available throughout the program. This allows them to be composed and reused in ways that ordinary variants cannot, but this flexibility can come at the cost of increased cognitive overhead for the developer.

I don’t understand what the graph structure alluded to here is.

It seems like the idea of this section is to foreshadow later discussion of structural typing, but the example immediately following this of a program with different tags for messages seems to illustrate this more clearly.

3 Likes

I immediately stumbled over the “files in folders” simile. What are the folders? What are the files? Why are the files in the folders?

1 Like

I agree that the analogy doesn’t fit, and I have the impression the analogy is confusing type abbreviations for a polymorphic variant types with the polymorphic variants themselves.

Overall, my impression is that the current plan fails to motivate the use of polymorphic variants due to a lack of concrete non-trivial examples. In particular stating the expression problems, or at least a subcase of it would help a lot to refine vague words like flexibility by showing rather than telling (I disagree with the point on evolving data: polymorphic variants are not better than ordinary variants for this).

Some more remarks:

  • Type abbreviations are not type declarations, one cannot declare polymorphic variants
  • I don’t know where you picked the name for “open bound”, “close bound”, and “close bound with open bound refinement”’ for the lower and upper bounds on the row variable, but using the right names would make the notation much easier to understand. It would also probably help introducing the row variable and explaining the implicit row variable in the declaration.
  • Structural typing is primarily associated with arrow types and tuples, not only polymorphic variants and objects.
  • Row polymorphism is orthogonal to structural typing: it is a form of parametric polymorphism.
  • Subtyping and structural typing are unrelated, I would also avoid introducing subtyping in the row polymorphism section to avoid confusion
2 Likes

@kopecs @cjr @octachron I appreciate you taking the time to provide feedback.

I see what you mean about the analogy, its well taken. When I first began using polymorphic variants and they began making sense, my mental model was thinking about “ordinary” variants as having a localized scope based on module-paths while polymorphic variants can be thought of as globally scoped, and the folder vs label analogy came to mind. I introduced the concept to some friends in this way before going over the notation with some examples and it seemed to point them in the right direction, but I was able to supply the caveats that it is a loose analogy, and we didn’t stick to it the analogy very long. With some distance from the text over the last few days and your concerns well stated, I agree that it lacks precision and isn’t well suited to a lesson. I removed it.

If you happen to have a better analogy, I’d be interested to hear it. Otherwise, the need for an analogy survived from the first draft and upon reflection I feel like it isn’t required anymore given the rest of the text.

@cjr Agreed, the analogy doesn’t map well under scrutiny

@kopecs I also agree with the abundance of bolding. I used it during editing to help me with pacing. I got used to it and didn’t think to take it out. Like you said, it is highly subjective and arbitrary. Removed.

@octachron By “evolving” I intended to capture the process of developing a state machine as smonat described here, or the process of incrementally defining functions as mentioned in the manual here, and I agree that my phrasing doesn’t express it well and got lost in edits. Incrementality would be a better term than evolving.

@octachron To your larger point about motivating the subject with concrete examples, I agree that those are vital, and I am currently working on a lesson that uses Garrigue’s paper as a starting point, with additional examples motivated by message board discussions. I am curious which would make for a better first lesson: the motivation (through common use-cases) or the notation/cs concepts (through more abstract \A's and `B`'s). It may be that the motivation lesson is better suited as the first lesson, or that they are merged into one. I’m not sure at this point.

For the other points, I’ll look into them further and make updates. I really appreciate the additional depth and clarity it brings.

Incidentally, lessons on polymorphic variants and GADTs share a “cart before the horse” problem, where the notation and cs concepts benefit a later discussion on motivation, but the same is true the other way around. Which is the cart, which is the horse? Its a subjective call, and I’m happy to figure out which way works for OCaml.org as these lessons are improved.

Thanks again for the feedback. I’ll reach out again for your thoughts once the 2nd lesson is ready.

One simple plea: never ever write “variants” or “polymorphic variants” without a following “types” or “constructors”. There’s a big difference and virtually all tutorials I’ve seen routinely conflate the two, and end up reading like nonsense.

1 Like

@mobileink I understand what you mean, and I feel the same way, but am likely to make mistakes without a formal background. For example, I ended up misrepresenting type abbreviations as type declarations as octochron pointed out. Given the following error message, I hope its understandable why I did so:

# type typ_2 = [> `A | `B | `C ];;
Error: A type variable is unbound in this type declaration.
       In type [> `A | `B | `C ] as 'a the variable 'a is unbound

If you have any examples in the lesson that do not make a clear distinction between a “variant type” / “polymorphic variant type” and “constructors” / “tags”, or examples you can think of that tend to be confusing, I can make sure the text emphasizes the correct usage and would greatly appreciate it.

Here’s an example of general confusion introduced by the standard terminology: “…ordinary variants define constructors for typed values while polymorphic variants define tags for typed values.” “Tags”? Really? What that really means is “we cannot explain this so we’ll use a nonce word”. In fact, as far as I can see, the backtick is a ctor that is exactly analogous to the lambda operator. Lambda constructs a value of some determinate function type. Backtick constructs a value of some indeterminate inductive type. More precisely, it (meta)-constructs a constructor (aka “value”). So instead of “tag” we should say something like “ctor-expression”, just like “lambda expression”.

1 Like

A second point that I think is underappreciated. Ordinary variant types are indeed defined. Literally: finite, determinate, unambiguous. Not so with polymorphic variants. (Notice I did not say “polymorphic variant types”. Types are always definite and determinate.) A PV “type” is in fact a type family, or something like that. Indeterminate (albeit constrained.) So a PV ctor-expression like ``A` never has a determinate type. Not a definition. To put it differently, the ctors of an ordinary variant type define (i.e. limit) the type. That is not the case with PV ctor-expressions (aka “tags”).

1 Like

I take your point that these terms are not used with exact precision and can benefit from more clarity. I think this highlights why polymorphic variants are so tricky to introduce to beginner, intermediate and even experienced developers not coming from a functionally typed background.

Several OCaml topics can be taught by example with tangential reference to type-theoretic terminology or implementation details, but polymorphic variants and GADTs in particular can require pointed reference to their underpinnings to provide an accurate model for how they work, which introduces additional complexity and difficulty to the lesson.

It makes sense to me that this would result in a gap between beginner material and advanced material, and I find myself navigating this intermediate zone frequently. Indeed, this is why I am challenging myself to write this material to force myself to research and compose these concepts for my own benefit and hopefully produce material that acts as a bridge for others in the future.

Thanks to everyone’s feedback, the current lesson needs another few passes to incorporate the feedback received, but when it is ready I hope I can reach out to you to validate the updates and solicit additional feedback. I’d be thrilled to work with you to iron out more inexact language. We clearly have the same goals in mind.

In the meantime, I added the following clarifying note to the section introducing the choice of terminology introduced in the section “On Constructors vs Tags”:

"
Note: Tags are technically constructors for a polymorphic variant and can be referred to as such, but for clarity we will use the term “tag” to refer to constructors of polymorphic variants and “constructor” to refer to constructors of ordinary variants. The distinction is that constructors (of ordinary variants) construct a predetermined type, and tags (of polymorphic variants) construct an adhoc type that is indeterminate until applied to a specific value’s type. This means that the exact type of a polymorphic variant remains open and is determined by the operations and contexts in which it is employed.
"

At this time, I am not fully convinced that polymorphic variants are never fully determined, but that may either be a misunderstanding of your comment or reflects my lack of continued lack of complete understanding. The note may solicit additional feedback, but for now I wanted to demonstrate that I am taking your feedback seriously and trying to find a way to reflect it in the lesson’s flow and at an approachable complexity level.

In an upcoming rewrite, I will play around with replacing “tag” with “ctor-expression” and see how well it maps onto the lesson on variants in Basic Data Types and Expression to make sure they compose well with each other. The use of “tag” is mostly a semantic tool for conveniently distinguishing between the two kinds of variants.

Are you really sure ? The manual proposes a short, to the point and beginner friendly introduction to polymorphic variants, I’m not sure its worth adding much more.

It is going to subjective and contingent on the mission of the document, and I’m not suggesting the manual should be changed.

I too would describe the manual as concise and approachable for OCaml beginners, and add that I find it to be appropriately scoped. However, the manual doesn’t contain elucidations on underlying concepts. As an example, chapter 5 of the manual (Polymorphic variants) makes no explicit reference to and explanation of row polymorphism, structural typing, and the process of refinement, which I find useful for understanding how polymorphic variants work. These topics are perhaps better suited to being expanded upon in a lesson.

An assumption could be made that lessons are geared towards developers familiar with strong static typing concepts, and if that is the case the lesson can be simplified significantly.

I think GADTs are a good example for how concepts in the manual can be expanded upon effectively. The manual provides a concise and beginner-friendly introduction to GADTs, and the longer treatment of the subject in RWO does a good job of tying the concepts together with examples in the narrative of a lesson. I’m the type of person that appreciates the sentiment behind RTFM, and I read the manual early in my OCaml journey, but it was RWO that made it clear to me what the motivation for and utility of GADTs was.

For another data point to consider, I appreciated the perspective @cuihtlauac introduced in his lesson on polymorphic variants from 2 years ago noting that LLMs hallucinate on the subject of polymorphic variants extensively. I forget exactly where it was noted, but the suggestion was made that this provides motivation for a lesson that treats the subject more explicitly. Two years later, I too find LLMs unreliable on the subject of polymorphic variants. On the subject of GADTs, however, LLMs clearly benefited from the text in RWO.

When the second draft is ready, I hope I can reach out to you for feedback on the lesson. It will be valuable to hear feedback that puts a downward pressure on the scope of the lesson.