let x () = `A
let _ =
match x () with
| `A -> ()
let _ =
match x () with
| `A -> ()
| `B -> ()
In this program I’m wondering when does the type checker “close” the variant in the return type of x (), to conclude that the first pattern match is exhaustive, while also allowing the second pattern match which has a clause that will never match.
My understanding is OCaml uses row polymorphism under the hood. So the type of x is something like: (not in OCaml syntax)
forall (r : row) . () -> [A, ..r]
(where ..r indicates row extension)
So when we instantiate the type in a use case we get [A, ..r1] (where r1 is a unification variable).
In the first pattern match, r1 needs to be unified with an empty row for the type checker to conclude that the pattern match is exhaustive.
In the second pattern match, it needs to do the same, but after unifying the type of pattern B with the scrutinee type.
When exactly unification of the row variable with empty row is done?
My guess: it type checks this as usual (for each branch, unify LHS with the scrutinee type, unify RHS with the type of the whole expr), then “closes” the variant type to check for exhaustiveness.
The actual typing of the pattern is more involved, but yes the typechecker decides to keep open inferred variant types or not in order to improve the exhaustiveness of pattern matching
More precisely, the type of the single pattern `A might be closer to [?`A | 'r ] where ?`A means that `A can be either present or absent and 'r is the row variable.
In the first step, the unification of the pattern type then goes through each row of the pattern match, giving you a temporary type [ ?`A | ?`B | 'r ].
In the second step, after analysing the pattern matrix, those inferred polymorphic variant types are kept open if this doesn’t make the pattern matching not exhaustive.
For instance, in
let f x y = match x, y with
| `A, `X -> ()
| _, (`X|`Y) -> ()
the pattern is still exhaustive if the type of the first x column is kept open, but the type of the second y column must be closed. Consequently the type of f is inferred as [? `A | 'r1 ] -> [< ? `X | ?`Y | 'r2 ] -> unit. The row variable 'r2 is still here because we could infer later than in fact `X or `Y is present or absent in the type.
At this point, for closed polymorphic variant types, we are back to the surface language for polymorphic variants: [ < ?`X | ?`Y | 'r2 ] can be read as [< `X | `Y ].
Finally as a last step, we go back to the surface presentation for the open variants by marking as present all constructors appearing in them (if there are no conjunctive types). Thus [? `A | 'r1 ] becomes [> `A ].