Using GADT to model records that share some of their fields

Let’s say I have some data type t that always has a couple of fields and optionally has a few others. Something like this.

type t = {w: w; x: x; y: y option; z: z option}

Then you could imagine having functions to get the values of those fields.

val w : t -> w
val x : t -> x
val y : t -> y option
val z : t -> z option

I wanted to see if I could set it up so that the y and z functions could only be called on data that actually has something in y and z fields rather than None. So I set up a GADT that takes records with only the present data, and then tracks whether y and z are present or not.

  (* To track if y and z are present *)
  type no = No
  type yes = Yes

  type w = W
  type x = X
  type y = Y
  type z = Z

  type ('has_y, 'has_z) t =
    | Wx : {w: w; x: x} -> (no, no) t
    | Wxy : {w: w; x: x; y: y} -> (yes, no) t
    | Wxz : {w: w; x: x; z: z} -> (no, yes) t
    | Wxyz : {w: w; x: x; y: y; z: z} -> (yes, yes) t

  let w : type a b. (a, b) t -> w = function
    | Wx v -> v.w | Wxy v -> v.w | Wxz v -> v.w | Wxyz v -> v.w

  let x : type a b. (a, b) t -> x = function
    | Wx v -> v.x | Wxy v -> v.x | Wxz v -> v.x | Wxyz v -> v.x

  let y : type a. (yes, a) t -> y = function 
    | Wxy v -> v.y | Wxyz v -> v.y

  let z : type a. (a, yes) t -> z = function 
    | Wxz v -> v.z | Wxyz v -> v.z

And that compiles and works okay…the compiler doesn’t let you call y and z on variants that don’t have that data present. But, if instead of only two optional fields there were more, this way could get unwieldy pretty quickly. So I was wondering if there was a better way to encode this.

Some other ideas I had:

  • Don’t bother trying to “connect” the different variants in a GADT, just make them all separate types with specific functions for those types.
  • Use objects or classes and take advantage of row polymorphism or subtyping/mixins (no idea how that would look as I’ve never used the “O” in OCaml)

The xy problem is basically this: I have some external service that returns data that is more or less the same in most of the fields and differs in a few others. Eg, A, A', A'', ..., with each of those data types sharing many of their fields, and other sets B, B', B'', ... sharing most of their fields, and so on. And I would like to deal with them in a nice way.

2 Likes

So here’s an idea. It uses the ‘O’ but only for types. It’s partly inspired by this technique to label type parameters: Nomadic Labs - Labelled type parameters in OCaml

First your base setup

(* Your setup *)
type yes = Yes
type no = No

type w = W
type x = X
type y = Y
type z = Z

Then a little bit more prep: an option type but with yes/no tracking:

(* option but tracking none/some in the type system *)
type (_, _) o =
  | None : (no, _) o
  | Some : 'a -> (yes, 'a) o

Finaly, the core of the solution: a single-entry GADT with an object type tracking the yes/no status of each of the optional fields

type _ t =
  | T : { w: w; x: x; y: ('oy, y) o; z: ('oz, z) o } -> < y: 'oy; z: 'oz > t

You can easily define an accessor for each of the optional fields. This accessor doesn’t mention anything other than the optional field it is concerned with. This mean that the verbosity of the accessor is constant even if the number of optional fields grow.

let get_z
  : < z: yes; .. > t -> z
  = fun (T { z = Some v; _ }) -> v

Now a little demo showing the accessor cannot be applied on values that don’t have a z but can be safely applied to values with one.

let wx = T { w = W; x = X; y = None; z = None; }

(* THIS CAUSES a type error
let _ = get_z wx
*)

let wxz = T { w = W; x = X; y = None; z = Some Z; }

(* THIS IS FINE *)
let _ = get_z wxz

And another demo showing you can mix and match all the Ts with a simple existential and you can

(* an existential *)
type anyt = Any : _ t -> anyt

(* some arbitrary list for demo purposes *)
let anyts = [
  Any wx;
  Any wxz;
  Any wxz;
  Any wxz;
  Any wx;
  Any wx;
  Any wxz;
]

(* prints the expected number*)
let () =
  let c = ref 0 in
  List.iter
    (fun (Any ( T { z ; _ })) -> match z with Some Z -> incr c | None -> ())
    anyts ;
  Format.printf "There are %d Zs" !c
4 Likes

This is a bit hacky, but I think all the blowups are O(n) and none are 2^n.

type core = {
  w: w;
  x: x;
}

module Has_Core : sig
  type t
  val get_core: t -> core
  val set_cored: t -> core t
end

module Has_Y: sig
  type t
  val get_y: t -> y
  val set_y: t -> y -> t
end

module Has_Z: sig
  type t
  val get_z: t -> z
  val set_z : t -> z -> t
end

The downside of course, is now for all the ‘generic’ over msg functions, we will have to use the ugly setter/getters for the optional fields.

1 Like

@raphael-proust : once you have lifted the option to the type level with

type nothing = private Nothing_tag
type just = private Just_tag
type ('a,'b) opt =
  | Nothing: ('a, nothing) opt
  | Just: 'a -> ('a, just) opt

it is possible to keep the direct definition of the record with a constraint, rather than adding another layer of GADTs

type x type y type z type w
type 'p r = {
  x:x;
  y:y;
  z:(z, 'z_status) opt;
  w:(w, 'w_status) opt
}
constraint 'p =  <z:'z_status; w:'w_status>

then the get on opt is a total function that we can define once

let get (Just x) = x

while still being able to write functions that work with either Just or Nothing:

let f (type a) (x: <z:a; ..> r) =
  get x.w,
  match x.z with
  | Just z -> Some z
  | Nothing -> None

In particular, it also means that we can validate once the presence of a field:

type (_,_) eq = Refl:('a,'a) eq

let present (type a elt) (x: (elt,a) opt) : (a,just) eq option =
match x with
  | Just _ -> Some Refl
  | Nothing -> None

let test (type a b) (r: <z:a; w:b > present) = 
  match present r.w, present r.z with
  | Some Refl, Some Refl −>
    (* in this branch and only in this branch,
       we know that `w` and `z` are present *) 
    Some (get r.w, get r.z)
  | _ -> None

An important point is that here we are using an object type to (ab)use the fact that “method” names are commutative and we don’t have to remember the position of the z or w “method” in the object type.

8 Likes

Thanks for the suggestions everyone! I will try them out and see how it goes.