How would I implement a polymorphic recursive modules for generic types?

Assume the following module signature

module type Show = sig
    type t
    val show : t -> string
end ;;

and now some implementations like

module IntShow: Show = struct
    type t = int
    let show x = //trivial impl
end ;;

module StrShow: Show = struct
    type t = string
    let show x = //trivial impl
end ;;

Ok that is easy.

But now I want also a module that defines a generic type where each/some parts are generic but should be “showeable” as well

module GenTypeShow: Show = struct
    type 'a t = {value: 'a}
    let show x = 
	//not so trivial impl anymore
	let y = someShoweableModule.show x.value
	
end ;;

My biggest issue is how can I make this without having to instantiate a specific module for each of the inner types?
for Example

let a = {value = 1} //OK (int) GenTypeShow.t
let b = {value = "foo"} //OK (string) GenTypeShow.t

but then

let c = {value = {value = 1}} //((int) GenTypeShow.t) GenTypeShow.t
let d = {value = {value = "x"}} //((string) GenTypeShow.t) GenTypeShow.t

and also

let s1 = someShowModule.show c
let s2 = someShowModule.show d

Now I dont want to instantiate a specific module for each deeply nested generic type as this might lead to an explosion of modules

I am sorry if it is not clear what I want. Still very new at OCAML

1 Like

One suggestion for really generic and extensible solution would be to use this library for generic programming in OCaml (or some other conceptually similar implementation). In this particular example it would be something like:

[@@@reify_all]

type 'a poly_val = { value : 'a }

let show_poly_val : type a. a ty -> a -> string = fun t x ->
  match t with
  | Poly_val t -> (* The not so trivial impl to be here *)
                  "Custom! poly_val: " ^ Gfun.show t x.value
  | t          -> Gfun.show t x

let () = Gfun.show_ext (Poly_val Any) { f = show_poly_val }

let p t x = print_endline (Gfun.show t x)

let () =
  p (Poly_val Int) { value = 5 };
  p (Poly_val String) { value = "line" }

Another suggestion can be to resort to object-oriented programming and use classes and open recursion to simulate the approach used in object-oriented languages.

You can not define a generic type type 'a t = {value : 'a} with the condition that the type variable 'a is showable. When you define such a generic type you quantify over all types, but what you want is a bounded quantification: for all type 'a such that 'a is showable…

By the way, we can do something close from what you want using functor (but you’ll have to instantiate the right module in each case).

module type Show = sig
  type t
  val show : t -> string
end

let show (type a) (module S  : Show with type t = a) = S.show

module Show_int = struct
  type t = int
  let show = string_of_int
end

type 'a value = {value : 'a}

Now, you can define a functor saying : if I can show a value of type a then I can show a value of type a value:

module Show_value (S : Show) = struct
  type t = S.t value
  let show {value} = Printf.sprintf "{value = %s}" (S.show value)
end

and now:

let c = {value = {value = 1}}

(* you have to instantiate the functor with the right parameter *)
show (module Show_value (Show_value (Show_int))) c;;
- : string = "{value = {value = 1}}"

With modular implicit, the solution would be simpler but there is still a lot of work before having this in the language (as explain in this comment).

module type Show = sig type t val show : t -> string end

let show {S : Show} = S.show;;
val show : {S : Show} -> S.t -> string = <fun>

implicit module Show_int = struct type t = int let show = string_of_int end

implicit module Show_value {S : Show} = struct
  type t = S.t value
  let show {value} = Printf.sprintf "{value = %s}" (S.show value)
end

let c = {value = {value = 1}}

(* here with implicit, the module is inferred by the compiler *)
show c;;
- : string = "{value = {value = 1}}"

The desire for modular implicits seems to re-appear constantly.

Given that implicits won’t land before a long time, it’s best if this desire would be doused.
Please refer to this very zen video to find out more on that :wink:

3 Likes

Given that implicits won’t land before a long time,

I know. Don’t remind me. :cry:

could you please explain that generic code?
Where does this Poly_val come from and what is this line doing?

let () = Gfun.show_ext (Poly_val Any) { f = show_poly_val }

also I can not see how this will help to create a recursive polymorphic structure. For example

let x = p (Poly_val ???) {value = { value = 1 }}

Also I would be interested in seeing how to solve this using an OO approach

could you please explain that generic code?

In a simple implementation it’s based on extensible type definition

type _ ty += ..

We can extend this definition with a generalized constructor:

type 'a poly_val = { value : 'a }
type _ ty +=
  | Poly_val : 'a ty -> 'a poly_val ty

What’s generalized in this constructor unlike ordinary constructors e.g. Some is that the type parameter of the resulting type is determined by the constructor itself rather than solely by the type of the value passed when applying the constructor, i.e. when we apply the constructor Poly_val to a value of type 'a ty we won’t just get the value of type ('a ty) ty as with, say, Some (Int : int tySome Int : (int ty) option), but a value of type 'a poly_val ty (Int : int tyPoly_val Int : (int poly_val) ty, note the int poly_val, not int ty). This fact that constructor determines the type parameter allows a kind of sound “backward reasoning” when matching on this constructor. If we were matching on the value of type a ty (a is unknown) and matched on the constructor Poly_val, we now know that the value was actually of type b poly_val ty and can safely assume a = b poly_val (now b in unknown). This, in tun, can be exploited to dynamically determine (refine) the type of a value of a polymorphic type. Unlike situation with 'a option where we can never safely determine the actual type behind 'a, with a generalized constructor we can reveal the type by matching on a generalized constructor. This is what is done in show_poly_val:

let show_poly_val : type a. a ty -> a (* <-- the same 'a' as in 'a ty' *) -> string =
  fun t x (* <-- whenever 't' has type 'a ty', x has type 'a' *) ->
  match t with
  | Poly_val t            -> (* t : b poly_val ty ==>
                                a = b poly_val ==>
                                x : b poly_val, so we can do x.value *)
 (* ... *)

This already allows to do something like this:

type _ ty = ..
type 'a poly_val = { value : 'a }
type _ ty +=
  | Int : int ty
  | String : string ty
  | Poly_val : 'a ty -> 'a poly_val ty
type show = { f : 'a. 'a ty -> 'a -> string }
let funs = ref []
let register f = funs := f :: !funs
let show t x =
  let rec try_all =
    function
    | [] -> raise Not_found
    | { f } :: fs -> (try f t x with Not_found -> try_all fs)
  in
  try_all !funs

let show_int : type a. a ty -> a -> string = fun t x ->
  match t with
  | Int -> "int(" ^ string_of_int x ^ ")"
  | _   -> raise Not_found
let show_string : type a. a ty -> a -> string = fun t x ->
  match t with
  | String -> "string(" ^ x ^ ")"
  | _   -> raise Not_found
let show_poly_val : type a. a ty -> a -> string = fun t x ->
  match t with
  | Poly_val t -> "poly_val(value=" ^ show t x.value ^ ")"
  | _   -> raise Not_found
let () =
  register { f = show_int };
  register { f = show_string };
  register { f = show_poly_val }

let p t x = print_endline (show t x)
let () =
  p (Poly_val Int) { value = 5 };
  p (Poly_val String) { value = "line1" };
  p (Poly_val (Poly_val String)) { value = {value = "line2"} }

Where does this Poly_val come from and what is this line doing?

The library provides convenience and a more efficient implementation for this approach, in particular:

  • It provides efficient registration and lookup implementation (analogues of register and show above) that uses hashing on the outer constructor instead of simple consecutive attempts (O(1) lookup instead of O(number of registered show_* functions)). The show is Gfun.show and register is Gfun.show_ext that requires additional Poly_val Any argument for hashing.
  • It provides syntactic extension [@@@reify_all] to automatically generate the generic constructors such as type _ ty += Poly_val : 'a ty -> 'a poly_val ty for every type defined in the module
  • It provides default implementations for common types such as int, string, bool etc. and actually also generates (with [@@@reify_all]) generic descriptions that enable a default implementation to print the value of any type without explicitly implementing the show_* function so that even just
    open Generic
    open Ty.T
    [@@@reify_all]
    type 'a poly_val = { value : 'a }
    let p t x = print_endline (Gfun.show t x)
    let () =
      p (Poly_val Int) { value = 5 };
      p (Poly_val String) { value = "line1" };
      p (Poly_val (Poly_val String)) { value = {value = "line2"} };
      p (Poly_val (Poly_val Int)) { value = {value = 1} }
    
    will print
    {value = 5}
    {value = "line1"}
    {value = {value = "line2"}}
    {value = {value = 1}}
    
    without the need to manually implement anything.

also I can not see how this will help to create a recursive polymorphic structure

already used

  let x = p (Poly_val (Poly_val Int)) { value = {value = 1} }

If you mean an infinite recursive structure (works only with -rectypes compiler option), simple

p (let rec w = Poly_val w in w) (let rec v = { value = v } in v)

will give a stack overflow, but this can be supported e.g.

[@@@reify_all]
type 'a poly_val = { value : 'a }
type cycle = cycle poly_val
let show_cycle : type a. a ty -> a -> string = fun t x ->
  match t with
  | Cycle  -> "<cycle>"
  | _ -> assert false
let () = Gfun.show_ext Cycle { f = show_cycle }
let x  = p Cycle (let rec v = { value = v } in v)

Also I would be interested in seeing how to solve this using an OO approach

By OO approach I meant the usual way similar to Java’s Object#toString e.g.

type 'a poly_val = { value : 'a }
class type ['a] obj = object method v : 'a method to_string : string end
class mk_int v = object method v = v method to_string = string_of_int v end
class mk_string v : [string] obj = object method v = v method to_string = v end
class ['a] mk_poly_val v : ['a obj poly_val] obj =
  object method v = v method to_string = "value: " ^ v.value#to_string end
let int = new mk_int
let string = new mk_string
let poly_val = new mk_poly_val

let p x = print_endline x#to_string
let () =
  p (poly_val {value = int 5 });
  p (poly_val {value = string "line1" });
  p (poly_val { value = (poly_val { value = int 1 })})

In comparison with the functor or first-class module approach the generic programming one allows fully dynamic dispatch of the printers e.g.

let show_poly_val : type a. a ty -> a -> string = fun t x ->
  match t with
  | Poly_val (Poly_val Int) -> "(value = #" ^ string_of_int t x.value.value ^ ")"
  | Poly_val t            -> "(value = " ^ Gfun.show t x.value ^ ")"

will print { value = { value = 1 }} as just (value = #1) instead of (value = (value = 1)) while printing everything else as before without the need to construct different modules at such call sites. It also allows to implement other functions (not just printing) in a very generic way so that they can inspect (decompose and transform, not just pass around) the values of any type for which [@@@reify] provided the necessary description. But generic programming can be significantly slower due to dynamic lookup overhead (that also prevents compiler optimizations) and requires more initial effort to set up the necessary plumbing.

1 Like

Thanks for your extensive answer. It will take me some time to digest this.
One question thou: It seems to me that the generic approach is not type safe, Am I correct?
p (Poly_val (Poly_val bool)) { value = {value = true} } would crash as there is no registered show function

p (Poly_val (Poly_val bool)) { value = {value = true} }

In the library implementation it will actually work, because show for Bool is pre-registered. But with an unknown type witness in its place it will simpy print <value> (the dummy default). Anyway it won’t crash. In the sample implementation (with raise Not_found and without the library) it will raise Not_found.

This is also not violation of type safety as it won’t try to interpret the value of one type as the value of the other. Actually the whole point of introducing extensible GADTs into the language is largely about making it possible to dynamically inspect the types of values in a type-safe way and in OCaml all the code that doesn’t use unsafe features (Obj module or C bindings) is always typesafe. Type safety actually means ability to interpret behaviour of the program within the semantics of the language, without knowning the underlying implementation (say, how Bool is actually represented). Exceptions are part of OCaml semantics, so raising an exception is not a violation of type safety.

But the limitation of any approach involving dynamic dispatch is that the real implementation used for printing is not statically predictable. Actually the way that limitation is addressed in the library is that [@@@reify] (the pre-processor implementing it) always registers functions providing the capabilities to construct/deconstruct every declared type and adds a constructor to the _ ty type. Then a generic printing implementation based on the deconstruction functions is always available and if one never adds constructors to ty manually, the dummy <value> will never be printed. But still there’s no guarantee about whether the custom or default printing implementation will be used. With functors/first-class modules you always construct the implementation manually by applying functors and provide this implementation at every call site. With modular implicits approach the idea is that the compiler should ensure local non-ambiguity of the automatically-constructed implementation (i.e. there should be only one way to construct the implementation using the modules and functors in current scope or a compile-time error is issued).