Using GADT to model records that share some of their fields

Let’s say I have some data type t that always has a couple of fields and optionally has a few others. Something like this.

type t = {w: w; x: x; y: y option; z: z option}

Then you could imagine having functions to get the values of those fields.

val w : t -> w
val x : t -> x
val y : t -> y option
val z : t -> z option

I wanted to see if I could set it up so that the y and z functions could only be called on data that actually has something in y and z fields rather than None. So I set up a GADT that takes records with only the present data, and then tracks whether y and z are present or not.

  (* To track if y and z are present *)
  type no = No
  type yes = Yes

  type w = W
  type x = X
  type y = Y
  type z = Z

  type ('has_y, 'has_z) t =
    | Wx : {w: w; x: x} -> (no, no) t
    | Wxy : {w: w; x: x; y: y} -> (yes, no) t
    | Wxz : {w: w; x: x; z: z} -> (no, yes) t
    | Wxyz : {w: w; x: x; y: y; z: z} -> (yes, yes) t

  let w : type a b. (a, b) t -> w = function
    | Wx v -> v.w | Wxy v -> v.w | Wxz v -> v.w | Wxyz v -> v.w

  let x : type a b. (a, b) t -> x = function
    | Wx v -> v.x | Wxy v -> v.x | Wxz v -> v.x | Wxyz v -> v.x

  let y : type a. (yes, a) t -> y = function 
    | Wxy v -> v.y | Wxyz v -> v.y

  let z : type a. (a, yes) t -> z = function 
    | Wxz v -> v.z | Wxyz v -> v.z

And that compiles and works okay…the compiler doesn’t let you call y and z on variants that don’t have that data present. But, if instead of only two optional fields there were more, this way could get unwieldy pretty quickly. So I was wondering if there was a better way to encode this.

Some other ideas I had:

  • Don’t bother trying to “connect” the different variants in a GADT, just make them all separate types with specific functions for those types.
  • Use objects or classes and take advantage of row polymorphism or subtyping/mixins (no idea how that would look as I’ve never used the “O” in OCaml)

The xy problem is basically this: I have some external service that returns data that is more or less the same in most of the fields and differs in a few others. Eg, A, A', A'', ..., with each of those data types sharing many of their fields, and other sets B, B', B'', ... sharing most of their fields, and so on. And I would like to deal with them in a nice way.

So here’s an idea. It uses the ‘O’ but only for types. It’s partly inspired by this technique to label type parameters: Nomadic Labs - Labelled type parameters in OCaml

First your base setup

(* Your setup *)
type yes = Yes
type no = No

type w = W
type x = X
type y = Y
type z = Z

Then a little bit more prep: an option type but with yes/no tracking:

(* option but tracking none/some in the type system *)
type (_, _) o =
  | None : (no, _) o
  | Some : 'a -> (yes, 'a) o

Finaly, the core of the solution: a single-entry GADT with an object type tracking the yes/no status of each of the optional fields

type _ t =
  | T : { w: w; x: x; y: ('oy, y) o; z: ('oz, z) o } -> < y: 'oy; z: 'oz > t

You can easily define an accessor for each of the optional fields. This accessor doesn’t mention anything other than the optional field it is concerned with. This mean that the verbosity of the accessor is constant even if the number of optional fields grow.

let get_z
  : < z: yes; .. > t -> z
  = fun (T { z = Some v; _ }) -> v

Now a little demo showing the accessor cannot be applied on values that don’t have a z but can be safely applied to values with one.

let wx = T { w = W; x = X; y = None; z = None; }

(* THIS CAUSES a type error
let _ = get_z wx
*)

let wxz = T { w = W; x = X; y = None; z = Some Z; }

(* THIS IS FINE *)
let _ = get_z wxz

And another demo showing you can mix and match all the Ts with a simple existential and you can

(* an existential *)
type anyt = Any : _ t -> anyt

(* some arbitrary list for demo purposes *)
let anyts = [
  Any wx;
  Any wxz;
  Any wxz;
  Any wxz;
  Any wx;
  Any wx;
  Any wxz;
]

(* prints the expected number*)
let () =
  let c = ref 0 in
  List.iter
    (fun (Any ( T { z ; _ })) -> match z with Some Z -> incr c | None -> ())
    anyts ;
  Format.printf "There are %d Zs" !c

This is a bit hacky, but I think all the blowups are O(n) and none are 2^n.

type core = {
  w: w;
  x: x;
}

module Has_Core : sig
  type t
  val get_core: t -> core
  val set_cored: t -> core t
end

module Has_Y: sig
  type t
  val get_y: t -> y
  val set_y: t -> y -> t
end

module Has_Z: sig
  type t
  val get_z: t -> z
  val set_z : t -> z -> t
end

The downside of course, is now for all the ‘generic’ over msg functions, we will have to use the ugly setter/getters for the optional fields.

@raphael-proust : once you have lifted the option to the type level with

type nothing = private Nothing_tag
type just = private Just_tag
type ('a,'b) opt =
  | Nothing: ('a, nothing) opt
  | Just: 'a -> ('a, just) opt

it is possible to keep the direct definition of the record with a constraint, rather than adding another layer of GADTs

type x type y type z type w
type 'p r = {
  x:x;
  y:y;
  z:(z, 'z_status) opt;
  w:(w, 'w_status) opt
}
constraint 'p =  <z:'z_status; w:'w_status>

then the get on opt is a total function that we can define once

let get (Just x) = x

while still being able to write functions that work with either Just or Nothing:

let f (type a) (x: <z:a; ..> r) =
  get x.w,
  match x.z with
  | Just z -> Some z
  | Nothing -> None

In particular, it also means that we can validate once the presence of a field:

type (_,_) eq = Refl:('a,'a) eq

let present (type a elt) (x: (elt,a) opt) : (a,just) eq option =
match x with
  | Just _ -> Some Refl
  | Nothing -> None

let test (type a b) (r: <z:a; w:b > present) = 
  match present r.w, present r.z with
  | Some Refl, Some Refl −>
    (* in this branch and only in this branch,
       we know that `w` and `z` are present *) 
    Some (get r.w, get r.z)
  | _ -> None

An important point is that here we are using an object type to (ab)use the fact that “method” names are commutative and we don’t have to remember the position of the z or w “method” in the object type.

Thanks for the suggestions everyone! I will try them out and see how it goes.

Hey, @octachron Florian, why is eq type needed here? It looks like we can do without it here by adjusting present function to return a boolean. I extended your example to separate cases where different combinations of the optional fields are set. I think that the eq type might be used to help separate the cases here.

type nothing = private Nothing_tag
type just = private Just_tag
type ('a, 'b) opt = Nothing : ('a, nothing) opt | Just : 'a -> ('a, just) opt
type x
type y
type z
type w
type q

type 'p r = {
  x : x;
  y : y;
  z : (z, 'z_status) opt;
  w : (w, 'w_status) opt;
  q : (q, 'q_status) opt;
}
  constraint 'p = < z : 'z_status ; w : 'w_status ; q : 'q_status >

type 'q zw_present = < z : just ; w : just ; q : 'q >
type 'z wq_present = < z : 'z ; w : just ; q : just >

let get (Just x) = x

let f (type a) (x : < z : a ; .. > r) =
  (get x.w, match x.z with Just z -> Some z | Nothing -> None)

type (_, _) eq = Refl : ('a, 'a) eq

let present (type a elt) (x : (elt, a) opt) : (a, just) eq option =
  match x with Just _ -> Some Refl | Nothing -> None

type ('q, 'z) tested_info =
  | WQpresent of < q : just ; w : just ; z : 'z > r
  | ZWpresent of < q : 'q ; w : just ; z : just > r
  | Missing

let test (type a b c) (r : < z : a ; w : b ; q : c > r) =
  match (present r.w, present r.q, present r.z) with
  | Some Refl, Some Refl, _ ->
      (* in this branch and only in this branch,
       we know that `w` and `z` are present *)
      WQpresent r
  | Some Refl, _, Some Refl -> ZWpresent r
  | _ -> Missing

If you try, you will see that the test function fails to typecheck on


let is_present (type a elt) (x : (elt, a) opt) =
  match x with Just _ -> true | Nothing -> false

let test (type a b c) (r : < z : a ; w : b ; q : c > r) =
  match is_present r.w, is_present r.q, is_present r.z with
  | true, true, _ ->
      (* in this branch and only in this branch,
       we know that `w` and `z` are present *)
      WQpresent r
  | true, _, true -> ZWpresent r
  | _ -> Missing
49 |       WQpresent r
                     ^
Error: The value r has type < q : c; w : b; z : a > r
       but an expression was expected of type
         < q : just; w : just; z : 'a > r
       The method q has type c, but the expected method type was just

The reason is that the typechecker cannot assume anything after matching a boolean.
Contrarily, when matching on the GADT value Type.(Equal: (a,just) eq), the typechecker knows that we are in a local context where a = just.

Note that we could also match on the field themselves:

let test (type a b c) (r : < z : a ; w : b ; q : c > r) =
  match r.w, r.q, r.z with
  | Just _, Just _, _ ->
      (* in this branch and only in this branch,
       we know that `w` and `z` are present *)
      WQpresent r
  | Just _, _, Just _ -> ZWpresent r
  | _ -> Missing

The advantage of rather using the present function is that we extract just the local type equality that we need from r.w and friends. This is a classical pattern where we use the Type.eq
to extract a local type equality inside a pattern and transport it outside the context.

A case where the useful is more obvious is for GADT-equality:

type 'a ty =
  | Int: int ty
  | Float: float ty
  | Pair: 'a ty * 'b ty -> ('a*'b) ty
type value = V: 'a ty * 'a -> value

let rec equal: type a b. a ty -> b ty -> (a,b) Type.eq option =
  fun x y -> match x, y with
| Int, Int -> Some Type.Equal
| Float, Float -> Some Type.Equal
| Pair (a,b), Pair (x,y) ->
  begin match equal a x, equal b y with
  | Some Type.Equal, Some Type.Equal -> Some Type.Equal
  | _ -> None
  end
| _ -> None

Here the function equal x y either gives us None or a witness that the types of x and y
are the same. We can use it, for extracting a value from a V box :

let get_ty (type a) (ty: a ty) (V(ty',x)): a option = match equal ty ty' with
| Some Type.Eq -> Some x
| _ -> None

We could inline the definition of equal in get_ty but that would be quite cumbersome, and the code would likely end up being duplicated quite a number of times.

Thanks. I verified that your solution works locally .Thanks for clarifying this for me. Is there a book for me to read on classical GADT patterns? @octachron

type 'a ty =
  | Int : int ty
  | Float : float ty
  | Pair : 'a ty * 'b ty -> ('a * 'b) ty

type 'a value = V : 'a ty * 'a -> 'a value

let rec equal : type a b. a ty -> b ty -> (a, b) Type.eq option =
 fun x y ->
  match (x, y) with
  | Int, Int -> Some Type.Equal
  | Float, Float -> Some Type.Equal
  | Pair (a, b), Pair (x, y) -> (
      match (equal a x, equal b y) with
      | Some Type.Equal, Some Type.Equal -> Some Type.Equal
      | _ -> None)
  | _, _ -> None

let get_ty (type a) (ty : a ty) (V (ty', x)) : a option =
  match equal ty ty' with Some Type.Equal -> Some x | _ -> None

let get_value = function V (_, x) -> x

(* *)
let x = Int
let () = ignore x
let y = get_ty Float (V (Float, 1.0))
let () = ignore x
let z = get_ty Int (V (Int, 1))
let () = ignore x
let p = V (Int, 1)
let () = ignore x

(* *)
let q = get_value (V (Int, 1))
let () = ignore x
let r = get_value (V (Float, 1.0))
let () = ignore x
let s = get_value (V (Pair (Int, Float), (1, 1.5)))
let () = ignore x