Does using object types in GADTs get you any more power?

I recently learned that you could use object types in GADTs! In this example it’s just used as a “type-level record”, which if I understand correctly means that it isn’t much more powerful that using * or -> types, with the only real difference being that you can name the types.

On first blush this seemed like a very powerful technique to me—I vaguely understand that OCaml’s type system for objects is pretty different from the rest of the language, with objects having row polymorphism / subtyping. But upon experimentation I couldn’t find any way of using this: every weird GADT I could come up with could be replicated only using * types.

Is this technique just not as powerful as I thought, or am I missing something?

1 Like

I found watching three minutes of https://youtu.be/pO8Z4lcY0ys?si=AQ1xqRsqCmJ5iZ4e&t=470 helpful (7:50 to approximately 11:00). That video link came from A pattern for using GADTs to subset cases - #2 by octachron

1 Like

As explained in the talk of @AltGr about Catala, this helps when e.g. you have two or more ASTs that are quite close but you don’t want to:

  • duplicate code on them (e.g. a printer);
  • don’t want to merge them a single ASTs because it implies having to deal with case that should have disappear after some steps.

This is done through sub-typing which you can get with GADTs + either object types or polymorphic variants. It appears that object types are more pleasant to use in this case (for instance, error messages are better).

I recently started to port the code of Owi with this technique and was surprised about how well it went. IIRC, one thing that is not mentioned in the talk but that is quickly needed is to use as 'a.

Louis also gave a talk at OUPS and as an answer, at the next meetup, Boris Yakobowski gave another talk about an alternative they use at Adacore. It is in the case of their compiler where they have 48 micro-steps, each one creating a different type of ASTs. One notable thing is the use of a ppx to help. We are going to put the video online at some point here (but the talk was in french).

4 Likes

GADTs are incompatible with object subtyping: they are always invariant with respect to their type parameters. For instance,

type +'a opt = None | Some: <x:'a> opt

fails with

Error: In this GADT definition, the variance of some parameter
cannot be checked

because GADTs are concerned with equality, and subtyping coercion and equality relationship don’t play nice together.

The use of object types in Catala is purely optional, but using object types at the type-level record provides you:

  • named type parameter <x:int; .. >
  • object type definition inlining:
type t = <x:int; y:int>
type more = <t; z:int >
  • row variable polymorphism for constraints that work if you keep them far away from GADT equations

In other words, object types afford you a better programming experience at the type level but they don’t give you more expressiveness power.

And the brittleness of the last point is the reason why using object type is often better than polymorphic variant for this use case, since object types makes it easier to not introduce long-lived row type variable that will clash later with a GADT equation.

3 Likes

GADTs are incompatible with object subtyping

Sure, I did not say they are. :slight_smile:

I probably used the word sub-typing incorrectly but isn’t row polymorphism a form of sub-typing, as explained in Structural Subtyping as Parametric Polymorphism for instance ?

Row parametric polymorphism can be quite similar to subtyping. However, the presence of GADT magnifies the difference between the two, because GADTs equations will often freeze the row type variable at the most inopportune time, and break the expected subtyping relationships.

Consider for instance, the toy language:


type _ expr =
  | Int: int -> <bool_like:yes; num:yes; value:yes; ty:int > expr
  | Float:
      float ->
      < bool_like:no; num:yes; value:yes; ty:float > expr
  | If:
     <bool_like:yes; num:_; value:_; ty:_ > expr
    * (<bool_like:'b; num:yes; value:_;ty:'ty> as 'e) expr
    * 'e expr
    -> <bool_like:'b; num:yes; value:no; ty:'ty > expr
  | Add:
      (<num:yes; bool_like:'b; value:_; ty:'ty> as 'e) expr
      * 'e expr
      -> <num:yes; bool_like:'b; value:no; ty:'ty> expr

One can write an interpreter to the syntactic values of this language with

let rec value: type b n v ty.
    < bool_like:b; num:n; value:v; ty:ty> expr
    -> <bool_like:b; num:n; value:yes; ty:ty > expr =
  function
  | Int _ as i -> i
  | Float _ as f -> f
  | If (cond,l,r) ->
     let cond = match value cond with
       | Int x -> x <> 0
       | _ -> .
     in
     if cond then value l else value r
  | Add (x,y) ->
      match value x, value y with
     | Int x, Int y -> Int (x+y)
     | Float f, Float g -> Float (f+.g)
     | _ -> .

Looking at the definition of If, one might be tempted to simplify it to:

  | If:
     <bool_like:yes; .. > expr
    * ...

since what matters is that the type is bool_like. However, this is not equivalent simplification, and the previously correct value cannot be typechecked anymore

   |      let cond = match value cond with
                                 ^^^^
Error: This expression has type < bool_like : yes; .. > expr
      but an expression was expected of type
        < bool_like : 'a; num : 'b; ty : 'c; value : 'd > expr
      Type $a is not compatible with type < num : 'b; ty : 'c; value : 'd >
      Hint: $a is an existential type bound by the constructor If.

The mysterious error message is due to the fact that in the new definition of If

  | If:
     <bool_like:yes; .. > expr

the row type variable is now an existentially quantified type variable. And such existentially quantified row type variable cannot be extended with fields anymore. Thus when we extract the cond term in

  | If (cond,l,r) ->
     let cond = match value cond with
       | Int x -> x <> 0
       | _ -> .
     in
     if cond then value l else value r

we are not allowed to inspect any of the other type tag of the condition, which fails because value require all type tags to be available.

In this case, we can recover the simplification by making sure that we don’t have an existentially quantified row variable with:

type 'a tagged = 'a
  constraint 'a = <bool_like:'b; num:'n; value:'v; ty:'ty>
...
  | If:
     <bool_like:yes; .. > tagged expr

But this means that we are keeping away from any subtyping through row type variable.

In other words, subtyping through row type parametric polymorphism doesn’t work reliably in presence of GADTs, and my advice is to avoid it at the start rather than eliminate it later on after many fights against existentially quantified row type variables.

4 Likes

I’m not sure the problem is really due to GADT, but to a bad use of subtyping (so the error message).

Consider these two types:

type 'a t = <bool_like:yes; ..> as 'a
type a = Any : 'a t -> a

The last one is equivalent to the closed object type:

type b = <bool_like:yes>

What we got with row type polymorphism ('a t) is that we know we got a subtype 'a of b. And in the code of value we need a subtype of <bool_like:yes; num:'b; ty:'c; value: 'd>. But, from a subtyping point of view, we only know that some b (or a) are of this type, and not all of them. Hence the type-checker, from a subtyping point of view, can’t prove that 'a is of the right type.

Edit: Row type polymorphism can be use with GADT, under the condition that the row type varibale appears in the type parameters of the GADT. Otherwise, when you hide the row type under an existential, you are upcasting your value in a closed object type (so you don’t really need row type polymorphism) and, therefore, you can’t specialize or downcast it anymore.