GADTs and typed environments

The Writing an interpreter with OCaml GADTs finishes with

It is possible to have a typed interpreter that avoids the type reconstruction (and the string comparison!) by enforcing the correspondence between variable indices and environments at the type level, but that gets complicated.

Does anyone have an example?

I’m rewriting my compiler from Zig to OCaml and need to implement value environments to know the expression that variables hold at any given moment. I use this for inlining.

You probably can take a look on this code that I made a long time ago: mirage-lambda/src/typedtree.ml at master · mirage/mirage-lambda · GitHub It’s a typed lambda calculus with a typed environment. You can also see the not so beautiful way to transform a simple AST to the GADT (with all proofs) here: mirage-lambda/src/typedtree.ml at a89b265b552f8b63ff725fc942f41a276fabb4f5 · mirage/mirage-lambda · GitHub

2 Likes

I came up with the following. It works fine if I need to look up an id, know the type and need to fetch the value.

Is there a way to fetch the value by name, without knowing the type? The use case is finding an id when parsing. The type is not known at that point but a value may exist in the environment.

My find implementation below doesn’t work since I can only find out if the value exists and cannot unwrap it. Is there a better way than enumerating all the types with a given name by calling Type_env.lookup repeatedly?

module Type_env = struct
  type 'n t =
    | Nil
    | Cons : string * ('a, 'n) Type.t * 'a * 'n t -> 'n t

  type ('a, 'b, 'na, 'nb) eq = Refl : ('a, 'a, 'na, 'na) eq

  let make () = Nil

  let types_equal
    : type a b n. (a, n) Type.t -> (b, n) Type.t -> (a, b, n, n) eq option
    =
    fun a b ->
    let open Type in
    match a, b with
    | Register, Register -> Some Refl
    | Register_index, Register_index -> Some Refl
    | Bit_field, Bit_field -> Some Refl
    | Macro, Macro -> Some Refl
    | Memory_region', Memory_region' -> Some Refl
    | _, _ -> None
  ;;

  let append ~env ~name ~ty ~value = Cons (name, ty, value, env)

  let rec lookup : type a n. n t -> string -> (a, n) Type.t -> a option =
    fun env name ty ->
    match env with
    | Nil -> None
    | Cons (xname, xty, value, rest) ->
      if name = xname then (
        match types_equal ty xty with
        | Some Refl -> Some value
        | None -> assert false
      ) else
        lookup rest name ty
  ;;

  type 'n value = Value : ('a * ('a, 'n) Type.t) -> 'n value [@@unboxed]

  let rec find : 'n t -> string -> 'n value option =
    fun env name ->
    match env with
    | Nil -> None
    | Cons (xname, ty, value, rest) ->
      if name = xname then
        Some (Value (value, ty))
      else
        find rest name
  ;;
end

First, I would advise to use the standard library Type.eq type:

  let types_equal
    : type a b n. (a, n) Type.t -> (b, n) Type.t -> (a, b) Stdlib.Type.eq
    =
    fun a b ->
    let open Type in
    match a, b with
    | Register, Register -> Some Eq
    | Register_index, Register_index -> Some Eq
    | Bit_field, Bit_field -> Some Eq
    | Macro, Macro -> Some Eq
    | Memory_region', Memory_region' -> Some Eq
    | _, _ -> None

If you need to catpure more equalities, you can use tuple types:

type _ t = Int: int t | Char: char t
let (%=%) (type a b c) (x:a t) (y:b t) (z: c t):
 (a * a, b * c) Type.eq option =
match x, y, z with
| Char, Char, Char -> Some Type.Equal
| Int, Int, Int -> Some Type.Equal
| _ -> None

Second your find function works? It is just equivalent of returning a regular ADT type

type s = Register of ... | Register_index of ...

And it it not possible to do better without more static type information.
In particular, calling multiple times the lookup function is the same thing as pattern matching over the ty tag of a value.

1 Like

You must prove the link between a witness of your variable in the form of a type and what is available in an environment. In the example of mirage-lambda, we use De Bruijn indices (so these are not names) because they can be encoded under a GADT (Peano numbers).

In addition, this allows us to know the position of the variable’s value in a list whose size is also encoded with Peano numbers.

1 Like

The find function works but gives me back an opaque type I can’t do anything with because of the existential type bound by the constructor Value.

I was hoping there was to look up the value by name, without knowing the type in advance.

Thanks for the tip with the standard library witness!

I was hoping to avoid traversing the type environment multiple types.

If you have a pair of types

type _ tag =
| Int: int tag
| Float: float tag
type handwritten_sum = Box: 'a tag * 'a -> handwritten_sum

Then a value of type handwritten_sum is equivalent to a value of the corresponding ADT

type sum = 
| R_int of int 
| R_float of float

let to_sum = function
| Box(Int,x) -> R_int x
| Box(Float,x) -> R_float x

In other words the handwritten_sum is reconstructing by hand the usual discriminate union encoding of ADT by wrapping together an integer tag (with a type) with the content of the union.

1 Like

I understand this now, thanks!

I don’t think it helps me figure out the type of an identifier I find while parsing, though.

I can implement the type environment with plain ADTs but I was hoping there’s a GADT solution since GADTs let me avoid unwrapping values by hand.

How does OCaml implement typed environments?

Can someone please point me to the right place in the source tree?

OCaml has one namespace by kind of objects in the environment, and the grammar always allow to identify the kind of an identifier.

Is there a way to improve this cascade of matches?

module Output = struct
  include Output'

  let lift env (piece : 'a S.Display.piece) =
    match piece with
    | Text id -> Text id.value
    | Id id ->
      (match Type_env.lookup !env name Type.Register with
       | Some value -> Register value
       | None ->
         (match Type_env.lookup !env name Type.Register_index with
          | Some value -> Register_index value
          | None ->
            (match Type_env.lookup !env name Type.Bit_field with
             | Some value -> Bit_field value
             | None ->
               (match Type_env.lookup !env name Type.Scanner with
                | Some value -> Bit_field value
                | None ->
                  error (tag id) (Printf.sprintf "Unknown id '%s'" id.value)))))
  ;;
end

My first thought was iterating through a list of types to match but that doesn’t work since the list expects to be homogenous.

Build a Type.t list module out of Nil and Cons, just like TypeEnv?

Overall, it doesn’t sound like GADTs are great for implementing typed environments, not when the environment is indexed by the identifier name.

It looks like I should just byte the bullet and re-implement with ADTs, use a string Map and check the type to see if I got the right identifier.