I’ve written a three-part tutorial on GADTs. The first part shows the very basic. It is intended for anyone who already knows OCaml but hasn’t written a GADT before. The second part is a collection of different techniques you can use when making your own GADT. It is intended as a follow-up of the first part or for programmers who have written GADTs before but feel like they still have things to learn. The third part is a gallery of GADTs found in the public OCaml software ecosystem. It’s intended for anyone who wants to see practical-use examples.
I intend to expand the gallery of GADTs (part three). Let me know if you have suggestions.
You don’t need a -> to have a GADT. With a GADT you can do the following
type _ tag =
| F : float tag
| S : string tag
let f (type kind) (case : kind tag) (v : kind) =
match case with
| F ->
(* `F` has type `float tag`,
thus `case` has type `float tag`,
thus `kind = float`,
thus `v` has type `float` *)
print_float v
| S ->
(* same here but with string instead of case *)
print_string v
let () =
f F 42.0;
f S "42";
f F "42"; (* type error *)
f S 42.0; (* type error *)
The type of v depends on the value of case !
You could of course do:
type tag =
| F of float
| S of string
let f = function
| F f -> print_float f
| S s -> print_string s
But then you have some useless boxing going on.
You could do the same without boxing and with Obj.magic but this is unsafe:
type tag =
| F
| S
let f case v =
match case with
| F -> print_float (Obj.magic v)
| S -> print_string (Obj.magic v)
let () =
f F 42.0;
f S "42";
f F "42"; (* no type error, will segfault or print garbage *)
f S 42.0 (* no type error, will segfault or print garbage *)
Nice post! One thought on Part 2, which alludes to a correspondence between hlists and tuples. Maybe it would be helpful to illustrate how far the similarity goes, as in the example below: for both stringify_hlist and stringify_tuple, the compiler is able to infer the correct argument type, and unlike with a regular list, it knows that both of these pattern-matches are exhaustive.
type _ hlist =
| [] : unit hlist
| (::) : 'a * 'b hlist -> ('a * 'b) hlist
let stringify_hlist: _ -> string list = function
| [a; b; c] ->
[Int.to_string a; b; Float.to_string c]
let stringify_tuple: _ -> string list = function
| (a, b, c) ->
[Int.to_string a; b; Float.to_string c]
I completely agree but also I start the blog-post by “You probably don’t need GADTs” and “you can use some other OCaml features to encode many of of the invariants that you might care about”, and I end the first part with “This GADT is sort of useless” and “A better alternative is to use a private type alias” so I think the tutorial already covers that.
(As a side note: I prefer the private type alias to your nonempty type+constructor. That because you can have zero-allocation casts between the vanilla list type and the non-empty list type. And more importantly because it let’s you use the Stdlib.List module in full rather than to have to rewrite functions.)
The last part of the tutorial shows useful uses of GADTs. The first part is about teaching you the basic syntax and rules.
Yeah it doesn’t work very well on my android either. The code is far too small. I’ve tried to fix the CSS but it turns out I’m not any good at it. If you have a fix to suggest for the CSS I’d be happy to try it out and apply it.
I have to say, I don’t know what the “Generalized” part of the name means. Probably there’s someone on this forum who has the answer.
As for the distinction between ADTs and GADTs it’s not a syntactic one. Indeed, you can write non-generalized ADTs with the newer syntax. The distinction is that the type parameter on the right-hand side of the -> in the different constructor declarations is either
repeated all the same on all the constructors:
(* not a GADT *)
type _ option =
| None : 'a option
| Some : 'a -> 'a option
different for different branches:
(* a GADT *)
type _ v =
| Char : char v
| Int : int v
Using abstract types as type-level tags is a dangerous anti-pattern, since the typechecker can only prove that they are not equal inside the current module. To make clear that the values are not intended to be used it is better to define private constructor:
type empty = private Empty_tag
type nonempty = private Nonempty_tag
which keeps a proof that the type two types are not equal available to the typechecker because Empty_tag ≠ Nonempty_tag.
GADTs are generalized because they introduce the possibility to have existentially quantified type variable in the argument of the constructors:
type any_list = Any: 'a list -> any_list
and to refine the constructed type in function of the constructor
type _ numty =
| Int: int num
| Float: float num
In other words, observing a constructor allow you to learn type equality about the type in the corresponding case.
Both features are not possible with a standard ADT . (The two features are partially independent, and before the final design GADTs, they were separate discussions on how to have existential quantification in variant types, but the feature was very easy to add to GADTs as “constructors + type equations” so
both features were introduced in the same package) .
Going back to the type equality aspect, this is why the eq type constructor,
type (_,_) eq =
| Eq: ('a,'a) eq
is in many way the quintessential GADT, which can be used to encode other GADTs.
For instance, the above numty is equivalent to
type 'a numty' =
| Int' of ('a,int) eq
| Float' of ('a,float) eq
To check that the two types are equivalent, we can write a zero function for both types:
let zero: type n. n numty -> n = function
| Int -> 0
| Float -> 0.
let zero': type n. n numty' -> n = function
| Int' Eq -> 0
| Float' Eq -> 0.
Moreover, there was originally some ambiguity in the naming, Guarded Algebraic Data Types is also meaningful: the variants are “guarded” by type equations that apply to the particular variant. (For example: “A Constraint-Based Approach to Guarded Algebraic Data Types” by Vincent Simonet and François Pottier.)
That’s the main reason. Something like that is mentioned in the tip/trick about making the property constructors private. But I think it’d be worth mentioning that leaving out the constructors is ok if the types are not exported. I’ll add that later.
Ok, I think I see see where you’re coming from. Problem is, F and S are not values. They’re unary ctors. So the type of v depends on the type of the arg used to parameterize the ctor. Which again is not the same as depending on the value of case.
Which means, if I’m not mistaken, that the standard notion “Dependent Type” is not adequate for this situation. (See Dependent type - Wikipedia or the HoTT Book)
Which imho is pretty dang interesting. Would this be sufficient to characterize GADTs? Dependent types where the dependency is not just on a value (ctor), but on the type parameter of a ctor.
Well F and S are constructors, but used as an expression they evaluate to values. As in
let x = F
let y = S
and we often say that F and S are values.
In this case, the type of the value that the variable x is assigned to is different from the type of the value that the variable y is assigned to. Different values, different types, no argument parametrising the constructors.
In fact, in @zapashcanon example there are no parameter for the constructors. So you can’t depend on “the type of the arg used to parameterize the ctor”. No such thing exists.
The Wikipedia article on GADTs has a Haskell example that suggests to me that “Specializable” would be more informative than “Generalized”. GADT ctors can produce e.g. Expr bool or Expr int, but ADTs can only produce Expr 'a. Does that make sense?
Incredible tutorial! I’d already read a couple other materials on the subject (RWO, and that hlist article everyone links to on diff-list typing tricks). This was very clarifying as to why and how we might use GADTs.