Not really, they could be used as a part of a solution which is more practical than GADT.
What’s wrong with GADT?
The problem with GADT is that they mix two orthogonal concepts - types and data representation (aka data types). While ADT define data representation, constraints, attached to GADT constructors, are pure types, as their only purpose is to constrain the set to which the type belongs. On one hand it is very nice, we can kill two rabbits at once, construct data and prove its correctness by construction. But this comes with a price because you have to repeat the proof every time you reconstruct your data. In other words, you can’t serialize and deserialize your GADT, as the constraint itself is not serializable (because it is not data).
Therefore, in real life, where you occasionally need to pass your AST around different hosts, store them in cache and files, etc, using GADT immediately becomes a nuisance. As every time when you need to read your AST from a string, you have to repeat the process of proving by construction. In other words, you need to write a typed parser, that does parsing (data processing) and typing simultaneously. This is notoriously hard and, in fact, is only possible in a small number of simple cases. E.g., it is practically1 impossible to reconstruct an arrow type from its string representation,
type 'a app =
| Unit : unit app
| App : 'a * 'b app -> ('a -> 'b) app
Separate the concerns
Therefore, a practical solution would be to separate the concerns back and use ADT for data types, and module types for types. In other words, we will hide our data representation behind an abstraction wall. And we may even use the private
keyword to allow others to peek into our privates through the holes in our abstraction wall.
The approach basically mimics the GADT approach, except that we move the constraints into the module type,
module Types : sig =
type scalar
type vector
type data = private
| Name of string
| Record of (string * data) list
| Array of (data * int)
type 'a t = data
val name : string -> scalar t
val record : (string * _ t) list -> vector t
val array : scalar t -> int -> scalar t
end = struct
type scalar
type vector
type data =
| Name of string
| Record of (string * data) list
| Array of (data * int)
type 'a t = data
let name x = Name x
let record elts = Record (elts)
let array elt sz = Array (elt,sz)
end
Now, our invariant is protected, so that nobody outside of the module could create an incorrect tree. However, inside the module, behind the wall, we are free to do whatever we want, since the constraint is now a phantom type, the compiler will allow us to ascribe any type to any tree. Of course, it will allow us to ascribe wrong types, so it is now our responsibility to implement the type checker correctly. But now we can read our data from text and give it back the type constraint. We can even reify the type constraint into a concrete data type representation and store it with AST.
In this solution we use the private
keyword to expose some of our abstraction. So that a user could use a regular pattern match, however this highlights a small problem. When we turned away from GADT we lost an important feature, as beyond proving the data type is constructed correctly, GADT enables us to use this proof when we deconstruct the data. With our plain ADT representation, which is now impossible to construct incorrectly (using the Types
interface), we still have to deal with an extra case, when we do
function Array (elt,n) -> match elt with
| Name s -> name s
| Array (elt',m) -> array elt m
| Record _ -> assert false
Since we don’t have the constraints anymore, the compiler can’t refute the Record branch (and indeed, an incorrect code inside of the Types
module can construct such representation, so we can’t blame the compiler for that.
And although assert false
is more or less fine here, we don’t want a casual user to rely on our internal invariants (an even know about them because an invariant, that leaks the module is no longer an invariant). The reason why the invarinat escaped is because we exposed our internal representation, we leaked it through the abstraction via the private
keyword. Therefore, the solution is to hide it,
module Types : sig =
type scalar
type vector
type _ t
val name : string -> scalar t
val record : (string * _ t) list -> vector t
val array : scalar t -> int -> scalar t
val case : _ t ->
scalar:(scalar t -> 'a)
vector:(vector t -> 'a) -> 'a
val match_scalar : scalar t ->
name:(string -> 'a) ->
array:(scalar t -> int -> 'a) -> 'a
val match_vector : vector t ->
record:((string * _ t) list -> 'a) -> 'a
end = struct
type scalar
type vector
type data =
| Name of string
| Record of (string * data) list
| Array of (data * int)
type 'a t = data
let name x = Name x
let record elts = Record (elts)
let array elt sz = Array (elt,sz)
let case t ~scalar ~vector = match t with
| Record _ -> vector t
| Array _ | Name _ -> scalar t
let match_scalar t ~name ~array = match t with
| Array (elt,sz) -> array elt sz
| Name s -> name s
| _ -> assert false
let match_vector t ~record = match t with
| Record elts -> record elts
| _ -> assert false
end
We confined our invariant inside our module, and we can use assert false
which could be triggered only if we bugged our code inside of the Types
modules, which is now considered the trusted kernel.
The users of our module now can use the whole power of constraints, and basically, use our typed AST as it was GADTs but without any pain associated with them.
Final Solution, Tagless Final
Finally, we can notice, that cases in our matches totally mimic the types of our branch constructors and this smells like a generalization opportunity. Indeed, we can abstract our abstraction into the module type
(* keep constraints not abstract, and make them provably distinct in
case if someone would like to use GADT in their representations. *)
type scalar = private Scalar_constraint
type vector = private Vector_constraint
module type Types = sig
type _ t
val name : string -> scalar t
val record : (string * _ t) list -> vector t
val array : scalar t -> int -> scalar t
end
(* our old Types module is just _a_ representation, let's call
it an AST *)
module Ast : sig
include Types
val case : 'a t ->
scalar : (scalar t -> 'a) ->
vector : (vector t -> 'a) -> 'a
module Analysis(L : Types) : sig
val scalar : scalar t -> scalar L.t
val vector : vector t -> vector L.t
end
end = struct
type data =
| Name of string
| Record of (string * data) list
| Array of (data * int)
type 'a t = data
let name x = Name x
let record elts = Record (elts)
let array elt sz = Array (elt,sz)
let case x ~scalar ~vector = match x with
| Name _ | Array _ -> scalar x
| Record _ -> vector x
module Analysis(L : Types) = struct
let rec scalar = function
| Name s -> L.name s
| Array (elt,sz) -> L.array (scalar elt) sz
| Record _ -> assert false
let rec vector = function
| Name _ | Array _ -> assert false
| Record ((_, (Name _ | Array _)) :: _ as elts) ->
L.record @@
List.map (fun (name,fld) -> name, scalar fld) elts
| Record elts ->
L.record @@
List.map (fun (name,fld) -> name, vector fld) elts
end
end
And now we can implement the analysis as simple as,
module Sizeof = Ast.Analysis(struct
type 'a t = int
let name _ = 4
let array elt sz = elt * sz
let record elts = List.fold_left (fun s (_,sz) -> s + sz) 0 elts
end)
let sizeof t = Ast.case t
~scalar:Sizeof.scalar
~vector:Sizeof.vector
This approach is called Taggles Final Style and is well-explored.
A few grains of salt
Although it all may look nice (or vice verse), there is a grain of salt. It doesn’t work as we have a problem in our original design, we dropped the type constraint of the record constructor. In other words, our record constructor, (string * 'a t) -> scalar t
is non-injective, as it projects different types into the same type. Therefore we can’t reconstruct the constraint back. Therefore, probably a better type system representation would be
type array = array
type 'a tuple = Tuple
type 'a t
val array : array t -> int -> array t
val tuple : 'a t -> 'b tuple t -> ('a -> 'b) tuple t
val unit : unit tuple t
and the case analysis will look something like this
val case : _ t ->
unit : (() -> 'a) ->
tuple : (('a -> 'b) tuple t -> 'a) ->
array : (array t -> 'a) -> 'a
val split : ('a -> 'b) tuple t -> 'a t * 'b tuple t
1) You can do this for a fixed number of constraints, e.g., int -> unit
, int -> string -> unit
, etc, not for arbitrary constraint. You would need a backtracking parser, which will try on each branch all possible cases. As a result, not only the performance will suffer, but the complexity of the parser itself.