GADTs & ASTs with polymorphic containers

Hi !

I work with a small language which has a literal type in its AST that’s similar to this (I removed most cases because they’re not relevant):

type literal =
  | Null
  | Bool of bool
  | Num of float
  | LList of literal list

I don’t think I need to spend too much time interpreting the different cases.

Because ASTs are the textbook example for GADTs, trying GADTs here has been in my todo list for quite a while.

So I did this:

type null 
type llist (* not sure about those phantom types *)

type _ literal =
  | Null : null literal
  | Bool : bool -> bool literal
  | Num : float -> float literal
  | LList : ??? -> llist literal

The issue is that LLists are polymorphic. They can contain anything, so I don’t know how to express it.
One might think that adding ADTs has little meaning here but I think there are some nice things to do.
First of all, my language has expressions, and I would want to have a function with the following signature

val eval_expr : type a. a expr -> a literal

Which brings me to my second question, what’s the type of a variable in an AST ?

type _ expr = 
   | Lit : 'a literal -> 'a expr
   | Plus : float expr -> float expr -> float expr
   | ListCons : 'a expr -> llist expr -> llist expr
   | Var : string -> 'a expr (* ??? *)

Moreover, my language also has (logic) formulae which are basically boolean expressions with quantifiers.

Right now, we have a function with the following signature:

val lift_expr : expr -> formula

Which has a lot of failing cases, and I would like to use GADTs to have something more like

val lift_expr : bool expr -> formula

And that would even allow me to rewrite the formula type as something similar to this:

type formula =
   | QF of bool formula
   | Forall of var list * formula

instead of having formula being basically a duplicate of expr with less cases and quantifiers.

I don’t know how clear that is, but to summarise, here are my two questions:

  • In that GADT literal type, what should be the type of the LList constructor to accept any kind of literal list ?
  • In an AST that has variables which are dynamically typed, what would be the definition for a nice GADT ?
    And a more global question: are GADTs even interesting in my case ?

Thank you ! :slight_smile:

1 Like

Hi @giltho,

In answer to your first question, the “direct” translation of your literal type into one that is typed (except for lists) might look as follows:

type null = |  (* Aside: empty types > abstract ones when possible: you *)
type llist = | (* probably want the type-checker to know [null <> bool]. *)

type _ literal =
  | Null  : null literal
  | Bool  : bool -> bool literal
  | Num   : float -> float literal
  | Llist : some_literal list -> some_literal list literal

and some_literal = E : _ literal -> some_literal [@@unboxed]

The trick use here is to use an existential wrapper E to hide the type of each element of the list separately, allowing the construction of a list of potentially-different literal types. In this case, some_literal is effectively recovering the behaviour of your original untyped literal type from the new typed one. This obviously restricts the kinds of properties you can enforce; it’s also possible to go the route of heterogeneous lists if you do want to propagate element types.

Regarding the Var constructor, I recommend starting by thinking about the functions you want to support – and what static properties you want to enforce about these functions – then working backwards from there to decide what your type parameters will need to be. (This pattern of using type parameters to implement individual use-cases is why such types tend to accumulate more and more type parameters as you add more functions over them… Looking at you, format6.) In this case:

val eval_expr : 'a expr -> 'a literal

We’re going to require that a Var expression has a type parameter that determines its type once evaluated to a literal. Since your language has only dynamically-typed variables, we don’t know what the corresponding literal type will be, and so your AST needs to reflect this ambiguity somehow:

type untyped = |

type _ literal =
  | Any : _ literal -> untyped literal 
     (* New case: really just the same existential trick as before, 
        but now directly in the [literal] type. When our [eval]
        finds the actual type at runtime, it can hide it here. *)

  | Null : null literal
  | Bool : bool -> bool literal
  (* etc. *)

type _ expr = 
  (* ... *)
  | Var : string -> untyped expr

(The exact details will depend on how your evaluator works.) As an aside, it’s also possible to have statically-typed variables in an AST like this by using (typed) runtime type representations. Then each variable would come along with a representation of its type, and the evaluator will be forced to preserve this type when substituting the variable:

type _ expr = 
  (* ... *)
  | Var : string * 'a type_rep -> 'a expr

(This avenue opens up many more rabbit holes of type equality witnesses and heterogeneous maps etc. needed to implement a type-safe evaluator.)

are GADTs even interesting in my case ?

This is ultimately up to you :slight_smile: If you find that your current implementations are overwhelmed with exceptions and assert false cases, it makes sense to me to scrap some of that via GADT encoding. The presence of dynamically-typed variables and untyped lists in your embedded language sets limits on what you can do, but it’s very possible to apply GADTs in moderation and get some nice benefits without falling into the “Everything Must Be Statically Encoded” trap – down that path lie dragons and frustration.

Hope some of this helps!


Hi @CraigFe ,
Thank you so much for your answer it’s quite detailed !

My conclusion is probably that we won’t do that in our code. A lot of legacy code that’s going to too hard to update, and more importantly, I’m using Visitors and I don’t think it’s going to work with all this.

That being said, I’m very intersted, your answer really clarifies the “what if I rewrote it from scratch” scenario as a thought experiment. I wasn’t aware, of empty types, and the Any trick in the literal AST is very cool !

Thank you :slight_smile:

1 Like