I’ve just found a nice trick related to extensible types (type t = ..) and I’m not sure whether it is known or not. As for me, I have not seen it.
So I will demo this trick with a toy example. An eval for an extremely simple AST.
I will start with the following type:
type t = ..
type t +=
| Int of int
and define an eval function for it. Later I will add Add and Mul constructors to that type and redefine eval, but I will pattern-match only on added constructors and re-use the already defined evals.
And yes, we have to write the eval in a special way
let eval ?next t = match t with
| Int n -> n
| _ -> match next with
| None -> failwith "."
| Some f -> f t
Adding additions
type t +=
| Add of t * t
let eval_v1 = eval
let rec eval ?next t =
let next = Option.value next ~default:(eval ?next:None) in
match t with
| Add (e1, e2) -> eval ~next e1 + eval ~next e2
| _ -> eval_v1 ~next t
Adding multiplications
type t +=
| Mul of t * t
let eval_v2 = eval
let rec eval ?next t =
let next = Option.value next ~default:(eval ?next:None) in
match t with
| Mul (e1, e2) -> eval ~next e1 * eval ~next e2
| _ -> eval_v2 ~next t
And that’s it.
PS. Edited to add the missed case. Thanks @brandon !
It looks like one line was cut off, and the base eval ends up with a non-exhaustive pattern matching. Here it is with the missing case added back:
let eval ?next t = match t with
| Int n -> n
| _ -> match next with
| None -> failwith "."
| Some eval -> eval t
The chaining is more visibly immediate to me when the default value is given at the parameter. It didn’t click with me until I wrote it this way.
type t = ..
type t += Int of int
let eval ?(next = fun _ -> failwith ".") = function
| Int n -> n
| t -> next t
type t += Add of t * t
let eval_v1 = eval
let rec eval ?(next = eval ?next:None) = function
| Add (e1, e2) -> eval ~next e1 + eval ~next e2
| t -> eval_v1 ~next t
type t = ..
type t += Int of int
let eval ?(next = fun _ -> failwith ".") = function
| Int n -> n
| t -> next t
type t += Add of t * t
let eval_v1 = eval
let rec eval ?(next = eval ?next:None) = function
| Add (e1, e2) -> eval ~next e1 + eval ~next e2
| t -> eval_v1 ~next t
type t += Mul of t * t
let eval_v2 = eval
let rec eval ?(next = eval ?next:None) = function
| Mul (e1, e2) -> eval ~next e1 * eval ~next e2
| t -> eval_v2 ~next t
In a real setting if you are going to bother using it, every eval will be defined in its own file/module and you will use SomeM.eval not eval_vN
Also some tests.
module Test = struct
let _1, _2, _3, _4, _5 = Int 1, Int 2, Int 3, Int 4, Int 5
let ( + ) e1 e2 = Add (e1, e2)
let ( * ) e1 e2 = Mul (e1, e2)
let _ = assert (_1 + _2 + _3 + _4 + _5 |> eval = 15)
let _ = assert (_1 + _3 * _5 |> eval = 16)
let _ = assert (_5 * _4 + _3 |> eval = 23)
let _ = assert (_5 * (_4 + _3) |> eval = 35)
let _ = print_endline "Tests passed."
end
Nice trick!
Here is an alternate implementation, based on ideas from the OOP world:
type t = ..
type t += Int of int
type eval_self =
| Self of (eval_self -> t -> int)
let eval (Self _self) = function
| Int n -> n
| _ -> failwith "."
type t += Add of t * t
let eval (Self self) = function
| Add (e1, e2) -> self (Self self) e1 + self (Self self) e2
| t -> eval (Self self) t
type t += Mul of t * t
let eval (Self self) = function
| Mul (e1, e2) -> self (Self self) e1 * self (Self self) e2
| t -> eval (Self self) t
(* Final binding: close the loop *)
let eval = eval (Self eval)
Compared to your implementation, mine always goes back immediately to the final version in recursive cases, which makes it possible to reliably override earlier implementations (there isn’t much that you could override in this tiny example, but you could consider for instance forbidding negative numbers or adding overflow checks).
Both versions cannot reliably handle multiple parallel extensions though: if two modules extend the type independently, and do not expose the actual new constructors (introducing functions to build new values instead), you can create expressions that use both extensions but getting an evaluation function supporting both extensions is going to be hard.
With my proposal, one possibility would be to use a dedicated exception in the initial function and catch it in a wrapper:
exception Unknown_constructor
let eval (Self _self) = function
| Int n -> n
| _ -> raise Unknown_constructor
module M1 = struct (* ... *) end
module M2 = struct (* ... *) end
module M3 = struct
let eval (Self self) t =
try M1.eval (Self self) t with
| Unknown_constructor -> M2.eval (Self self) t
end
This is not very satisfying though, as if M1 and M2 override existing constructors only M1’s overridings will be taken into account.
Yes, this is for example how cmarkit’s AST mapper and folder abstractions handle AST extensions (see the *_ext_default fields for handling the defaults folding of extensions, note this line is misindented).
In OCaml I would trace at least trace it to this 2000 paper by Jacques Garrigue which tries to tackle the famous expression problem (using polymorphic variants there).
I totally agree that this scheme is not well suited to “diverging and converging” extension. So be it.
As for your approach I guess it can be simplified if we do not insist that everything is called eval
type t = ..
type t += Int of int
let evalInt _eval = function
| Int n -> n
| _ -> failwith "."
type t += Add of t * t
let evalAdd eval = function
| Add (e1, e2) -> eval e1 + eval e2
| t -> evalInt eval t
type t += Mul of t * t
let evalMul eval = function
| Mul (e1, e2) -> eval e1 * eval e2
| t -> evalAdd eval t
(* Tying the knot *)
let rec eval t = evalMul eval t
What matters here, I guess, is the “late-binding” of eval
Speaking of which I remember writing a recursive descent parser in that style, which allowed me avoid (explicit) recursion until the last step. Which turned out to be more pleasant than having a bunch let rec parse_a.. and parse_b ... and .. and parse_z
type t = ..
type t += Int of int
let evalInt _eval = function
| Int n -> Some n
| _ -> None
type t += Add of t * t
let evalAdd eval = function
| Add (e1, e2) -> Some (eval e1 + eval e2)
| _ -> None
type t += Mul of t * t
let evalMul eval = function
| Mul (e1, e2) -> Some (eval e1 * eval e2)
| _ -> None
let rec eval t =
match List.find_map (fun subeval -> subeval eval t) [evalInt;evalAdd;evalMul] with
| Some v -> v
| None -> failwith "."
or even
type t = ..
let subevals = ref []
let add_subeval f = subevals := f :: !subevals
let eval t =
match List.find_map (fun subeval -> subeval t) !subevals with
| Some v -> v
| None -> failwith "."
type t += Int of int
let () = add_subeval @@ function
| Int n -> Some n
| _ -> None
type t += Add of t * t
let () = add_subeval @@ function
| Add (e1, e2) -> Some (eval e1 + eval e2)
| _ -> None
type t += Mul of t * t
let () = add_subeval @@ function
| Mul (e1, e2) -> Some (eval e1 * eval e2)
| _ -> None
Each “subeval” returns Some of the result if it knows how to handle the given term, otherwise returns None.