Hello,
I am trying to define a language in the style of the tagless-final interpreters of Carett, Kiselyov and Shan (my goal is to perform tagless-final abstract interpretation, as described in a forthcoming PLDI paper). I would like the language to be extensible: new operations, but also new types can be added.
This is an exemple of signature for such an extensible language.
type 'a typ = ..
type boolean = |
type 'a typ += Boolean: boolean typ
type 'a var_identifier = 'a typ * int
module type Lang_Base = sig
type 'a value
val true_: boolean value
val not: boolean value -> boolean value
type store
val var: 'a var_identifier -> store -> 'a value
val assign: 'a var_identifier -> 'a value -> store -> 'a value
end
type integer = |
type 'a typ += Integer: integer typ
module type Lang_With_Int = sig
include Lang_Base
val const: int -> integer value
val add: integer value -> integer value -> integer value
val leq: integer value -> integer value -> boolean value
end
How to implement this? One possibility would be to use GADTs:
module Lang_Base_Interp(* :Lang_Base *) = struct
type !'a value = ..
type 'a value +=
| BoolVal: bool -> boolean value
let true_ = BoolVal true
let not ((BoolVal x):boolean value) = (BoolVal (not x))
type store = .. (* A polymorphic map *)
let var _ = assert false
let assign _ = assert false
end
module Lang_With_Int_Interp:Lang_With_Int = struct
include Lang_Base_Interp
type 'a value +=
| IntVal: int -> integer value
let const i = IntVal i
let add (IntVal a) (IntVal b) = (IntVal (a + b))
let leq (IntVal a) (IntVal b) = (BoolVal (a <= b))
end
There are several problems here: we need to insert BoolVal and IntVal conversions every where, all values are boxed, we have non-exhaustive pattern matching errors everywhere (and the code does dynamic checks).
An alternative solution based on equality witnesses, is this one:
type ('a,'b) eq = Eq: ('a,'a) eq
let cast (type a) (type b) (Eq: (a,b) eq) (a:a) : b = a;;
module Lang_Base_Interp2(* :Lang_Base *) = struct
type 'a value
let to_bool: (boolean value,bool) eq = Obj.magic Eq
let from_bool: (bool,boolean value) eq = Obj.magic Eq
let true_ = cast from_bool true
let not x = cast from_bool @@ not @@ cast to_bool @@ x
type store = unit (* A polymorphic map *)
let var _ = assert false
let assign _ = assert false
end
module Lang_With_Int_Interp2:Lang_With_Int = struct
include Lang_Base_Interp2
let to_int: (integer value,int) eq = Obj.magic Eq
let from_int: (int,integer value) eq = Obj.magic Eq
let const i = cast from_int i
let add a b = cast from_int @@ (cast to_int a) + (cast to_int b)
let leq a b = cast from_bool ((cast to_int a) <= (cast to_int b))
end
This is a manual encoding of an extensible type family constructor (I think, in Haskell, this could be encoded directly as
data family Value a
data instance Value Integer = Int
data instance Value Boolean = Bool
It solves some of the problems: the values are now unboxed, and the code no longer performs any dynamic check and the generated code is as performant as can be. There are still two problems though: the use of Obj.magic for the equality proof, and the need to use cast operations. Ideally we would just add a “constraint boolean value = bool constraint integer value = int
”, but I don’t think that we can do this (even with Obj).
My question is the following: is there a better way to implement this use case?
The paper “Unboxing Mutually Recursive Type Definitions in OCaml” ( [1811.02300] Unboxing Mutually Recursive Type Definitions in OCaml ) by Colin, Lepigre and Scherer mentions a possible solution (unboxing GADT with several constructors, although this would not solve the pattern-matching error problem due to extensibility) and they conclude by “It seems fishy to only allow pattern-matching on partially-determined instances of the type, and we did not investigate further.” Maybe the above provides a compelling use case for such a feature?