ChatGPT explains GADTs

I asked ChatGPT: “Explain GADTs in OCaml” and it came up with the following answer:

GADT stands for Generalized Algebraic Data Type. In OCaml, GADTs are a language extension that allows defining more precise types for data structures.

In a traditional algebraic data type (ADT), each constructor has a fixed type signature. For example, consider a simple ADT for expressions:

type expr = 
  | Int of int
  | Plus of expr * expr
  | Times of expr * expr

Here, Int constructor has type int -> expr, Plus constructor has type expr -> expr -> expr, and Times constructor has type expr -> expr -> expr.

In contrast, a GADT allows us to define different types for each constructor. We can use this to express more precise type information about the constructors. For example:

type _ expr =
  | Int : int -> int expr
  | Plus : int expr * int expr -> int expr
  | Times : int expr * int expr -> int expr

Here, each constructor has a different type signature. Int constructor has type int -> int expr, Plus constructor has type int expr -> int expr -> int expr, and Times constructor has type int expr -> int expr -> int expr.

The _ symbol before the expr type parameter is a type variable that represents the “unknown” type. This allows us to define different types for different constructors.

By using GADTs, we can enforce more precise typing constraints on our data structures. For example, in the above example, we can now ensure that Plus and Times constructors only accept integer expressions as their arguments, and also guarantee that the result of applying these constructors will always be an integer expression.

GADTs can also be used to define types that are parameterized by other types, allowing for even more precise type information.

This is quite good IMHO.

My questions to you:

  • Do you see any problems with ChatGPT’s text as a small summary?
  • The example code is taken from neither “Real World OCaml” nor the OCaml manual. Does anybody know the source of this? Maybe it translated an example for another programming language? ChatGPT couldn’t come up with this on it’s own, could it?
1 Like

Since the GADT example is exactly equivalent to the ADT one, this is neither an explanation nor an example of a GADT.

Even the ADT example is not describing a parametric type constructor, muddling the frontier between
GADTs and ADTs.
Typically, remark that the “explanation”

cannot be painlessly be generalized to the very simple sum type:

type 'a t = X of 'a

In general, the finer details are also mostly false.

This is false: type constructors have nothing to do with GADTS.

This is also false. The _ symbol is only used because the name of the type argument in the type constructors is irrelevant.

My opinion if that if someone knew nothing about GADTs before reading this explanation, they will still know nothing about GADTs; but a nothing which is now a mixture of falsehood and correct but vague sentences.


It’s not a very clear explanation. For example, it says that “a GADT allows us to define different types for each constructor,” “[i]n contrast” with the preceding example of a non-GADT variant type. But the non-GADT example is manifestly one in which the constructors have different types — as the immediately preceding sentence spells out in detail!

And so on. For example, is it GADTs that allow us “to define types that are parameterized by other types”?

So ChatGPT demonstrated indirectly, that my knowledge of GADTs is deficient.

Thank you!

It is probably more a question of lack of experience in explaining GADTs.
Typically, if I wanted to explain GADTs with a typed expression DSL, I would amend the example to

type expr =
  | Int of int
  | Bool of bool
  | Plus of expr * expr
  | If_then_else of expr * expr * expr
  | Equal of expr * expr

Then the GADT version

type 'ty expr =
| Int: int -> int expr
| Bool: bool -> bool expr
| Plus: int expr * int expr -> int expr
| If_then_else: bool expr * 'a expr * 'a expr -> 'a expr
| Equal: 'a expr * 'a expr -> bool expr

is suddenly enforcing semantic constraints on expressions. This also allows to compare this solution to a strongly typed one based on ordinary variants:

type bool_expr =
| Bool of bool
| Bool_If_then_else of bool_expr * bool_expr
| Bool_equal of bool_expr * bool_expr
| Int_equal of int_expr * int_expr
and int_expr =
| Int of int
| Plus of int_expr * int_expr
| Int_if_then_esle of bool_expr * int_expr * int_expr

which end up splitting the type in two (boo_expr and int_expr) and duplicating the If_then_else and Equal constructor in two constructors. Without this contrast between the three implementations of the same idea, I feel that the comparison doesn’t work as an explanation.