New Draft Tutorial on Polymorphic Variants

Dear OCamlers,

The OCaml.org team continues working on new tutorials. We have a draft on polymorphic variants; we’d like your feedback on it:

Previously announced tutorials form a series.

  1. Installing OCaml
  2. A Tour of OCaml
  3. Your First OCaml Program
  4. Values and Functions
  5. Basic Datatypes and Pattern Matching

But this one is not intended to follow right after it. It is designed to be taken by people familiar with OCaml’s basics and willing to master polymorphic variants.

As a draft, it has gaps, most notably:

  • The section on Performance Drawbacks needs to be strengthened
  • An example inspired by @garrigue “Code reuse through polymorphic variants” paper is missing

Share your feedback here or in GitHub, but do not use the “Contribute” link at the bottom of the staging page.

Hope it helps

EDIT - What’s the target audience?

This targets intermediate-level OCaml developers.

6 Likes

A few remarks in passing. I find it odd that that polymorphic variants are alone in the “Language” section. What other items will appear here ? (maybe GADTs ?). I think at some level it could also make sense to present them in the same section as objects since they share the same typing discipline (structural).

Also, I find it odd to present the usual nominal types as “a set-like thing” and polymorphic variants as something else, while this is usually the other way around (since they feature union and even intersections, at least internally). And indeed the rest of the presentation is full of analogies with sets
(sets of integers, union of these smaller types --in the sense of inclusion-- etc…). So this sentence at the beginning seems contradictory.

I haven’t read all this in details, but some of the examples are great, but definitely an advanced topic.

1 Like

I am sorry, but I really don’t understand why this tutorial put that much focus on typechecker algorithms or the difference between types and type schemes.

Who is the audience of this tutorial?

Why do they need to count the number of types in the type scheme [< `Broccoli | `Fruit of string ] just after the first (trivial) example of a function using polymorphic variants?

1 Like

“Product types and data types such as option and list are variants and polymorphic. In this tutorial, they are called simple variants to distinguish them from the polymorphic variants

So they’re variants and they’re polymorphic, but they’re not polymorphic variants? This is where I would probably stop reading.

4 Likes

IMHO the key thing missing is any motivation - the Introduction just tells me they are different than the previous types (which I can guess, otherwise why give them a different name and their own page). Then there is “Origin and Context” which gives me some history (I’m not a historian) and then tells me some things with different names are different again.

We then get a note on some previous types we should be familiar with - OK. And then it says the new types are different, and … something about vegetables.

I can signify different types of fruit and vegetables using backticks which is lovely and everything but I don’t write a lot of food-based code and you haven’t said why this is significantly better than what I already know.

The “Note on Simple Variants…” is a good clear, slightly technical recap of what I have already learned. Then I need to see an example where I want to do X and it is difficult with the stuff I already know. I have to read almost to the end of the article past covariance and contravariance to find anything about use cases and when I do it is all abstract.

So - if possible I think it would be good to get a concrete example of a problem with simple parametric polymorphism (is that the right term?) that polymorphic variants will solve and show them solving it. That needs to happen before you go into details of how they fit together and where the problems are.

7 Likes

I think that the introduction is very obscure and only talks to people who already know the many concepts. It sounds more more like a disclaimer to the experts :“hey guys, I’m going to present the things you know this way and not that way” rather that a gentle pedagogic introduction with examples and motivation.

2 Likes

Thanks @sanette, you are right. The introduction is too heavy. Some stuff needs to be removed, some stuff needs to be moved somewhere else in the document

Thanks @K_N

I find it odd that that polymorphic variants are alone in the “Language” section. What other items will appear here ? (maybe GADTs ?). I think at some level it could also make sense to present them in the same section as objects since they share the same typing discipline (structural).

Where this topic may go is open for discussion. You suggestion makes a lot of sense.

Also, I find it odd to present the usual nominal types as “a set-like thing” and polymorphic variants as something else, while this is usually the other way around (since they feature union and even intersections, at least internally). And indeed the rest of the presentation is full of analogies with sets (sets of integers, union of these smaller types --in the sense of inclusion-- etc…). So this sentence at the beginning seems contradictory.

You are right. This needs to be clarified.

My personal style would be to focus on a core example that would compare code a polymorphic variant implementation with a standard variant one. This might help to only introduce type system consideration to explain the behaviour of the examples.

If you want to avoid the classical example of arithmetic expressions because it feels too academic, why not focus on card games? This is a simple example where there is a non-trivial lattice of subtypes of card sets where one may want to define function by composing functions defined on some subtypes.

type number = [ `One | `Two |`Three | `Four | `Five | `Six | `Eight | `Nine | `Ten ]
type deck_52 = [ `King | `Queen | `Jack | `Numeral of number ]
type tarot_deck = [ deck_52 | `Knight | `Trump of int ]
let compare_d52 = ...
let compare_tarot x y = match x,y with
  | #deck_52, #deck_52  -> compare_d52 x y
  |  ...
2 Likes

Thanks @mobileink

So they’re variants and they’re polymorphic, but they’re not polymorphic variants? This is where I would probably stop reading.

I’ve struggled a lot when writing this. I ended up with this simple sentence, I’m open to suggestions on how to avoid your reaction.

A more general remark is that, for people to give feedback on these tutorials, some rationale is needed.
(apologies if there already is a document that I missed, it may very well be the case).

  • who is this for ? total beginners (even people who are not used to program) ? students ? programmers reasonably fluent in at least one language ? The point is that depending on the target audience the examples and style varies greatly. If the target audience is programmers, then you can sometimes reason by analogy with other languages (whether it’s a good thing to present a language that way is orthogonal).
  • what’s the expected structure ? Basics/Advanced language features/Tooling and ecosystem ? and within these broad categories, sub-parts. For Basics what is already there is pretty clear, but how to regroup advanced features is indeed trick (typing features, language constructs (e.g. use of attributes) etc…)).

For instance, when talking about polymorphic variants, it seems important to mention at some point coercion, but coercion is also useful elsewhere (e.g. private types, classes). So maybe a short tutorial on coercion would be interesting (i’m just thinking out loud).

Also, by several aspects, it seems that several tutorial are another take on those that are already in the language manual. If these are not good enough for some reason, it could be good to highlight why (and then maybe they could be directly improved).

And on that note, it seems really bad to hide the language manual (and in particular, but not only the standard library) from the official ocaml.org documentation page. If its a problem of integration with the whole site it’s a pity, but e.g., it seems better to land on this:
https://v2.ocaml.org/manual/stdlib.html
than on this
https://v2.ocaml.org/releases/5.1/api/index.html

The former looks like a table of content, the later like the index at the end of a book. It also needlessly
shows modules should not be advertise in any way (Obj, Camlinternal*,…)

Sorry it went a bit beyond the scope of the polymorphic variant topic, I do appreciate the overall effort to improve the documentation.

2 Likes

Thanks @octachron

I really don’t understand why this tutorial put that much focus on typechecker algorithms or the difference between types and type schemes.

The focus on type-checking was intentional. But maybe it was a mistake. Consider the example in Error Handling Using the result Type. How can we explain why polymorphic variants succeed here while simple ones fail? The data is essentially the same. Highlighting the role of the type-checking algorithm felt natural. But your feedback makes me think things should be told differently, putting the merge of types first.

At the meta-level, the choice of examples and descriptions can be guided by type theory. But I am not convinced that they need to appear in the text itself for a tutorial aimed to non-academic beginners. ((Also beware that this section speaks of monad as if it was an obvious term when monad is not part of the prerequisite. Catching Invalid_argument _ or using List.nth is also a programmer error)).

For instance, one point of view is that polymorphic variants are the dual to tuples or objects (as an implementation of a structural product type).

Then, we can introduce the the error use case by making the comment that locally, when we emit an error we don’t know (or don’t want to know) what are the other possible error cases. Which is similar to the fact, with a tuple the function

let f (x, _y) = Format.printf "x=%d" x

don’t add a constraint on y. Contrarily, with a record type

type vec2 = { x: int; y:int }
let f v = Format.printf "x=%d" v.x

we already know that vec2 only contains a field x and a field y, both with type int.

Coming back to error case, when we emitting an error with

let f ( ) = Error `Unavoidable_error

We only have a constraint on the returned type that it contain the type [ `Unavoidable_error ], contrarily to the variant case:

type error =
| Unavoidable_error
| Mistake
let f () = Error (Unavoidable_error)

With this, I have the impression that we can build the understanding on structural types by starting with the well known structural types, build some examples with polymorphic variants, warn readers about the open nature of polymorphic variants (and what it means for error messages), and at much later point discourse about the limitation of row type variables.

1 Like

In addition to what has been said, if aimed at beginners (maybe it isn’t) I should avoid referring to the ‘domain’ and ‘co-domain’ of functions. Functional programmers and mathematicians will know what you mean, but I don’t think the expressions are widely used in the more general programming world.

2 Likes

Agreed. The first order of business should be to answer a simple question: what is the point? Or, what problems do polymorphic variants solve?

I think this blog post does a good job of demonstrating one such problem.

1 Like

I think you are setting yourself a difficult task because there are so many different practical aspects to polymorphic variants, and this is a practical guide. You have done a great job getting where you are. For me, one key practical point is that the compiler will construct a polymorphic variant type for you and tag names can be shared and reused. So this won’t work with simple variants:

type t1 = T of int
type t2 = T of float
let v = T 10

Instead you have to specifically disambiguate

let v:t1 = T 10

Whereas this is fine:

let v1 = `T 10 (* v1 has type [> `T of int ] *)
let v2 = `T 20.2 (* v2 has type [> `T of float ] *)

You can even match on these variants polymorphically:

let get_t = function
    `T v -> Some v
  | _ -> None

Then, unfortunately this leads on to the notions of covariance and contravariance. It is because of covariance that a list can be constructed implicitly by the expression [ `T | `U ], and why polymorphic variants are useful for the Error case of the result type. It is because of contravariance that functions can match on sub-types of their argument type. This leads on to type intersections, where I think you have done well.

Everyone probably has his or her view of what a practical guide should cover.

2 Likes

Thanks @mobileink. I hadn’t spotted that post. It’s good indeed.

This feedback in this thread is great; in light of it, here is the current plan:

  1. Rewrite the introduction. State motivation and user-facing features
  2. Bring use-cases sections right after “1st example”
  3. Add code samples in use cases which don’t have it
  4. Downplay the role of the type-checking algorithms

Thanks @R_Huxton for your feedback, this is very useful.

I agree with you on the introduction: state motivation and remove history. This will be fixed.

I also agree on use cases; it makes sense to bring them forward in the structure of the document and add code samples for each of them (although the latter is not always easy, suggestions are welcome).

My question is about the vegetable thing. I peeked that to avoid using single letters. But I understand this group of names can feel like cognitive noise. Is that the case for you?

To me this is fundamental. But I think its more than just practical, it changes the nature of the language. It makes “type inference” systematically ambiguous, since it now means two things:

  1. inference to defined, named types of values
  2. inference to new unnamed types from untyped values (personally I think “anonymous (lambda?) variant types” is a much better name than “polymorphic variants”)

(compare intro/elim rules in natural deduction)

It also makes type inference analogous to lexical/dynamic binding. “Normal” type inference is like lexical binding, it happens in the context of definitions (or declarations); inferring types for polymorphic variants is like dynamic binding in that it is determined by context-of-use.

1 Like

“What is displayed by the type-checker, for instance [< Broccoli | Fruit of string ] , isn’t a single type. It is a type expression that designates a constrained set of types.”

Personally I think it’s a mistake to call these things type expressions. That’s misleading and potentially confusing. They do not express types. Rather they express constraints on type inferencing. So something like x : [< ...] does not say what type x has, it expresses constraints on the type inference required to use x in an expression. So in fact introducing PV “types” changes the semantics of the : operator (as you mention in the tutorial).

“Polymorphic variants types are type expressions.”

Types are not type expressions. I’d like to say that [< ...] and [> ...] are PV type constructors, but since they don’t actually construct types that’s not quite right. Maybe we should use “type constraint”, so they become “type constraint constructors”; or maybe just stick with “type constraint expressions” and call them “operators”.

“The structural typing algorithm used for polymorphic variants creates type expressions that designate sets of types …”

I’d just say that the compiler can infer/derive typing constraints from “tags” - except I wouldn’t use “tags” at all, I think those things are better called something like “ad-hoc variant data constructors” - and it expresses those constraints as a PV-expression, i.e. [>...] etc. A PV-expression expresses the possible types a value or variable might have.

2 Likes