Inferred types and polymorphic variants

I understand why (1) when the output of a function is a polymorphic variant type, it gets a lower bound from the tags used in the function, and (2) when the input of a function is a polymorphic variant type, it gets an upper bound from the tags matched against if there is no catch-all case. Together these restrictions prevent a function being applied to an argument that it couldn’t deal with:

# let foo x = match x with `Foo -> 0 | `Bar -> 1;;
val foo : [< `Bar | `Foo ] -> int = <fun>
# let baz = `Baz;;
val baz : [> `Baz ] = `Baz
# foo baz;;
Error: This expression has type [> `Baz ]
       but an expression was expected of type [< `Bar | `Foo ]
       The second variant type does not allow tag(s) `Baz

This is good, because the match in foo has no clause for Baz, so it wouldn’t know what to do with that argument.

However, I don’t understand why, when there is a catch-all case in a function, it instead gets a lower bound from the tags matched against.

# let foo2 x = match x with `Foo -> 0 | `Bar -> 1 | _ -> 2;;
val foo2 : [> `Bar | `Foo ] -> int = <fun>

The only effect of this that I can see is to prevent the function from being applied to arguments that it would have no problem dealing with. With this definition of foo2, we can indeed call

# foo2 `Baz;;
- : int = 2

but a slight modification makes it fail:

# let baz : [`Baz] = `Baz;;
val baz : [ `Baz ] = `Baz
# foo2 baz;;
Error: This expression has type [ `Baz ] but an expression was expected of type
         [> `Bar | `Foo ]
       The first variant type does not allow tag(s) `Bar, `Foo

Of course, given the inferred typed of foo2, it makes sense that by constraining the type of baz I’ve chosen to prevent foo2 baz from being valid. But why is the type of foo2 constrained in that way? Intuitively, looking at the definition of foo2 it seems that it should be able to deal with any polymorphic variant input at all, and just chooses to do something special with two of the possibilities.

In particular, this behavior means that adding a catch-all case to a function makes certain operations that used to typecheck not typecheck any more.

# let bar : [`Bar] = `Bar;;
val bar : [ `Bar ] = `Bar
# foo bar;;
- : int = 1
# foo2 bar;;
Error: This expression has type [ `Bar ] but an expression was expected of type
         [> `Bar | `Foo ]
       The first variant type does not allow tag(s) `Foo

This is counterintuitive to me.

(I assume you meant foo2 instead of foo in the above.) It is not correct that foo2 can deal with any polymorphic variant input at all. For example, foo2 cannot be applied to `Bar 12 since pattern matching dispatch is based on the constructor name, the type system must ensure that the `Bar and `Foo constructors must not have any arguments when used with the function foo2. This is precisely what the inferred type enforces.

Cheers,
Nicolas

Ah, interesting, thanks. So the [> `Foo | `Bar] in the type of foo2 isn’t really for enforcing "at least the tags `Foo and `Bar" but rather “if the tags `Foo and `Bar are present, they must have these types”. It seems confusing to use the same type for this as in the output of, say [ `Foo ; `Bar] which really does mean that at least the tags `Foo and `Bar must both be considered.

Yes, except that it should be “or” instead of “and”.

But actually it makes perfect sense, as this type enforces the exact same invariant (namely, that if the constructor `Foo or `Bar are included in this list they must not carry any arguments). Indeed, it is not possible to consider things like [`Foo; `Foo 12] (for same reason as discussed before for the argument of foo2).

Cheers,
Nicolas

Are you saying that [> `Foo | `Bar ] really just means “if `Foo or `Bar is included it must not carry any arguments”? That seems contrary to the manual which says

[>`Off|`On] list means that to match this list, you should at least be able to match `Off and `On, without argument.

Also, if that’s what [> `Foo | `Bar ] means, then how does [< `Foo | `Bar > `Foo] differ from [< `Foo | `Bar]? If the only allowed constructors are `Foo without arguments and `Bar without arguments, then doesn’t that imply already that whenever `Foo is included it must not carry any arguments?

No, I’m not saying that the type is defined by that property, only that the property is enforced by it. A naive way of understanding the full meaning of the type [> `Foo | `Bar] is to think of it as being [ `Foo | `Bar | alpha ] where alpha is a type variable that can be unified with other polymorphic variant types disjoint from [ `Foo | `Bar ] (here disjoint means using different constructor names).

This explains why (`Baz : [ `Baz ]) does not belong to the type [> `Foo | `Bar] (there is no way to unify the hidden type variable of [> `Foo | `Bar] to make this type compatible with [ `Baz ]).

On the other hand, the type of (`Baz : [> `Baz]) itself contains a type variable which can be unified with [> `Foo | `Bar] giving the common type [> `Foo | `Bar | `Baz ]. In short, we see why (`Baz : [> `Baz]) belongs to the type [> `Foo | `Bar] but (`Baz : [ `Baz ]) does not.

All this is rather hand-wavy. If you want a more precise explanation one good source is http://gallium.inria.fr/~fpottier/publis/emlti-final.pdf, section 10.8 (“Rows”).

Cheers,
Nicolas

1 Like

Thanks. That reference is pretty dense, but your sketch is helpful. Can you give a similar sketch of the meaning of [< `Foo | `Bar]?

Here the idea is that this type is [ alpha ] where alpha is a type variable that can only be unified with polymorphic variant types defined by a subset of `Foo and `Bar (without arguments).

Hence this type can be unified with [ `Foo ] and [> `Foo] and [ `Foo | `Bar ] and [> `Foo | `Bar] but not with [ `Foo 12 ] or [> `Baz ].

Finally, the type [< `Foo | `Bar > `Bar] can be understood as [ `Bar | alpha ] where the variable alpha can only be unified with polymorphic variant types defined using a “subset” of `Foo. So it can be unified with [ `Bar ] and [> `Bar] and [> `Bar | `Foo] and [< `Foo | `Bar] and [ `Foo | `Bar] but not with [`Foo] or [> `Bar 12].

Cheers,
Nicolas

1 Like

Ok, I think I get it. Thanks!