Polymorphic variants and module signatures

I am following along with the section on polymorphic variants in RWO. Specifically, I am playing with this .mli file and this .ml file, except I changed the definition of extended_color_to_int in variants-termcol-fixed.ml to be

let extended_color_to_int = function
  | `RGBA (r,g,b,a) -> 256 + a + b * 6 + g * 36 + r * 216
  | `Grey x -> 2000 + x
  | (`Basic _ | `RGB _ | `Gray _) as color -> color_to_int color

Just like it says in RWO, this function definition compiles without error, even though we’ve inadvertently added a new polymorphic variant constructor by misspelling “Gray” as “Grey”. In order to get an error, we have to explicitly add a type annotation:

let extended_color_to_int : extended_color -> int = function

Why? As I understand it, the type definition in variants-termcol-fixed.mli defines the exact tags that are a part of the type extended_color, so a function that pattern matches on a value of type extended_color would have to match that exact set of tags, no more, no less. And we’ve also explicitly stated in the .mli that extended_color_to_int takes a value of type extended_color. Nonetheless, RWO explains: “All that happened was that the compiler inferred a wider type for extended_color_to_int, which happens to be compatible with the narrower type that was listed in the mli.” If we really have specified the exact set of tags that are allowed to be included in the type extended_color, then why is the compiler allowed to infer a “wider type”, and how can this type be “compatible” with the type we’ve supplied?

1 Like

Apparently new users are only allowed 2 links per post, so here is a link to the section on polymorphic variants in RWO. The bit in question is near the end of the section.

Consider this simpler example in utop:

# let f = function `A -> 0 | `B -> 1 | `C -> 2;;
val f : [< `A | `B | `C ] -> int = <fun>

Notice the inferred type of f. It contains <. This means f can handle values of type `A, `B, or `C, or less than that set of possible values. Consider:

# type t = [`A | `B];;

# let x : t = `A;;
val x : t = `A 

# f x;;
- : int = 0

When you don’t ascribe a type, the compiler infers the most general one. However, you could ascribe a more restrictive type:

# let f : [< `A | `B] -> int = function `A -> 0 | `B -> 1 | `C -> 2;;
Characters 58-60:
Warning 11: this match case is unused.
val f : [< `A | `B ] -> int = <fun>

The definition is valid, though you are given a warning. You don’t get this warning when compiling an ml file against a more restrictive mli. That’s because it is often useful to do more inside of an ml than you want to expose with the mli.


Thank you @ashish for that thoughtful answer. It prompted me to play around a bit with polymorphic variants in utop, and I think I now have a much better understanding of what kinds of polymorphic variant types are inferred in various situations. However, I’m still not completely sure I understand how type checking works for these. Here’s a minimal example to illustrate:

# let f = function `A -> 0 | `B -> 1;;
val f : [< `A | `B ] -> int = <fun>
# let g : [ `A | `B ] -> int = f;;
val g : [ `A | `B ] -> int = <fun>

Even though the type of g is technically different from the type of f, this type-checks just fine. I know nothing about OCaml internals or type theory, but this example seems to suggest that type checking is much more complicated in the case of polymorphic variants. For other “less advanced” aspects of the language (e.g. built-in types, records, and non-polymorphics variants), it seems like a program will type check only if the type of all values matches their respective inferred or annotated types. For example, let x : int = "hello" does not type check because we say that x will be bound to a value of type int, but "hello" has type string, and int and string are not the same type. But when you look at polymorphic variants, suddenly the type system isn’t simply binary (meaning a value isn’t simply of a certain type or not), but rather more like a set, and type checking consists of verifying that the type of a value is a subset of what is allowed. That is significantly more complicated! And what’s more, there appears to be a little magic involved, as I found in the OCaml docs:

The > and < inside the variant types show that they may still be refined, either by defining more tags or by allowing less. As such, they containan implicit type variable.

Indeed, I confirmed in utop:

# type t = [ `A | `B ];;
type t = [ `A | `B ]
# type t = [< `A | `B ];;
Error: A type variable is unbound in this type declaration. 
In type [< `A | `B ] as 'a the variable 'a is unbound
# type 'a t = [< `A | `B ] as 'a;;
type 'a t = 'a constraint 'a = [< `A | `B ]

I have no idea what’s going on in that last type declaration (I’ve never come across the as keyword in a type decl before and I don’t know what it does or what’s it’s useful for), or why there’s an implicit type variable in some polymorphic variants, but I think maybe understanding those things would start me down a road to understanding what goes on underneath the hood with regards to polymorphic variants. Explanations of or feedback on any of the above obversations are welcome.

TLDR: How does type checking for polymorphic variants work under the hood?

Indeed. Here is a list of papers that you could check out.

A few years ago, the common advice was to use regular variants instead of polymorphic variants because polymorphic variants get confusing. That seems less true now. The community comfortably uses polymorphic variants often, probably because they do have several benefits (no need to pre-define them, and the sub-typing aspect).

1 Like

I would approach this from the perspective of subtyping and co/contravariance. So my intuition (and I don’t know if it’s correct here) is that f can be assigned to g because the type of f is a subtype of the type of g. I.e.,

[< `A | `B ] -> int is_a_subtype_of [ `A | `B ] -> int

So, because of function contravariance,

[ `A | `B ] is_a_subtype_of [< `A | `B ]

This makes sense to me because [< `A | `B ] is a type that can potentially have less cases than [ `A | `B ], hence the latter seems to be more specific and thus the subtype.

However, I’m still not completely sure I understand how type checking works for these

Your example here is really just an instance of polymorphism. It is very similar to this example:

# let f = [];;
val f : 'a list = []
# let g : int list = f;;
val g : int list = []

In this case the type of f is polymorphic in the type of the list’s elements. In your example, f is polymorphic in whether the input type allows A or B tags. It is not clear from the syntax but there is a hidden type variable inside any [< ... ] or [> ...] type.

Again you can see similar behaviour with your type definition examples:

# type t = 'a list;;
Error: Unbound type parameter 'a
# type 'a t = 'a constraint 'a = 'b list;;
type 'a t = 'a constraint 'a = 'b list

I would approach this from the perspective of subtyping and co/contravariance.

Whilst polymorphic variants do have a subtyping relation, that is only used when using the :> syntax to cast them. In these examples subtyping is not involved, and so the variance of the type does not matter. For example,

# let f () : [< `A | `B] = assert false;;
val f : unit -> [< `A | `B ] = <fun>
# let g : unit -> [`A | `B] = f;;
val g : unit -> [ `A | `B ] = <fun>

Here the row polymorphism allows the type to be specialised to [A | B] even though it is in a covariant position.

1 Like

Possibly my old lecture notes would be a useful read:


and accompanying slides:



Thank you for the links to the lecture notes. I had actually found those notes myself yesterday. Reading them quickly lead to more questions, and at this point I can see that although I started out asking a specific question about polymorphic variants, what I’m really interested in is a broad knowledge of type theory and it’s practical applicability to programming in functional languages. Luckily, it looks like this is precisely what your course is about! Therefore I’m content to leave my specific questions about polymorphic variants until I’ve gotten my fundamentals straight.