First class module signature mismatch

I’m stuck. I’m experimenting with a cute way to abstract reading and writing over string, bytes, Buffer.t, bigstring, etc. behind a single interface. The very last step, which I thought would be nice, is to use a functor to “curry” the specific buffer being worked on into the first-class modules I return. It causes an interesting type error that I can’t quite figure out.

Proof-of-concept structure:
module Buf = struct
  type data = Bytes of bytes | String of string

  type 'a t = { data : data; size : int; mutable pos : int; mutable len : int }

  let of_bytes ?(pos = 0) ?(len = 0) (s : bytes) =
    { data = Bytes s; size = 0; pos; len }

  let of_string ?(pos = 0) ?(len = 0) (s : string) =
    { data = String s; size = 0; pos; len }

  module type S = sig
    val block_size : unit -> int
  end

  module Make_bytes (I : sig
    val i : bytes t
  end) : S = struct
    let block_size () = I.(i.size)
  end

  module Make_string (I : sig
    val i : string t
  end) : S = struct
    let block_size () = I.(i.size)
  end

  let select b =
    match b.data with
    | Bytes d ->
        (module Make_bytes (struct
          let i = b
        end) : S)
    | String d ->
        (module Make_string (struct
          let i = b
        end) : S)
end
What a module using it could look like:
module Some_codec = struct
  let decode ~src ~dst =
    let module In = (val Buf.select src) in
    let module Out = (val Buf.select dst) in
    In.block_size ()

  let test = decode ~src:(Buf.of_string "R") ~dst:(Buf.of_string "U")
end

In other words, Buf.select gives me the correct module depending on which type of data is wrapped by Buf.t. That way, a single function could read from and write to various types of buffers without having to provide functions or arguments for each possible type, and without having to specify the Buf.t argument. This scheme also avoids run-time matching on the Buf.data variant at every single I/O call.

However, the compiler isn’t fond of i, my “curry”:

Signature mismatch: Modules do not match:
  sig val i : bytes t end
    is not included in
  sig val i : string t end
Values do not match:
  val i : bytes t
    is not included in
  val i : string t

The first module created in Buf.select “wins”, and the compiler complains that the second one doesn’t match. The function itself is bytes t -> (module S), so I can’t see why it needs to specialize for one specific type. Am I over-engineering this idea to death, or is this salvageable cleanly?

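In select, you’d like b to have type bytes t in the Bytes case and type string t in the String case. But b is a function parameter, so it has a single type throughout the body: the first branch fixes it to bytes t, and the second branch then fails to match. Making this work takes a bit of magic, in this case GADTs.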
My suggestion is to replace your type definitions with:

type _ data = Bytes : bytes -> bytes data | String : string -> string data

type 'a t = { data : 'a data; size : int; mutable pos : int; mutable len : int }

You’ll also need some type annotations on select (let select (type a) (b : a t) =), and maybe a few more things will need to be adapted, but this should get you at least a bit further.
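
Putting those pieces together, a minimal sketch of the GADT version might look like the following. Setting size to the length of the data is an assumption of this sketch (the proof of concept sets it to 0); everything else follows the structure from the question.

```ocaml
module Buf = struct
  (* The type index records which kind of data is wrapped. *)
  type _ data = Bytes : bytes -> bytes data | String : string -> string data

  type 'a t = { data : 'a data; size : int; mutable pos : int; mutable len : int }

  (* Assumption of this sketch: size is the length of the underlying data. *)
  let of_bytes ?(pos = 0) ?(len = 0) (s : bytes) =
    { data = Bytes s; size = Bytes.length s; pos; len }

  let of_string ?(pos = 0) ?(len = 0) (s : string) =
    { data = String s; size = String.length s; pos; len }

  module type S = sig
    val block_size : unit -> int
  end

  module Make_bytes (I : sig
    val i : bytes t
  end) : S = struct
    let block_size () = I.i.size
  end

  module Make_string (I : sig
    val i : string t
  end) : S = struct
    let block_size () = I.i.size
  end

  (* (type a) is a locally abstract type: matching on Bytes refines it to
     bytes in that branch, and matching on String refines it to string. *)
  let select (type a) (b : a t) : (module S) =
    match b.data with
    | Bytes _ ->
        (module Make_bytes (struct
          let i = b
        end) : S)
    | String _ ->
        (module Make_string (struct
          let i = b
        end) : S)
end
```

With this, Buf.select (Buf.of_string "abc") and Buf.select (Buf.of_bytes b) both typecheck, and each branch passes a value of the type its functor expects.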

Thanks a lot, @vlaviron! I had toyed with GADTs earlier in this experiment but removed them as “unnecessary”, which they were at the time; that was before I tried to “curry” an instance with functors.

Edit: thanks to @anmonteiro I looked up locally abstract types and I’m on the right track to understanding why this works now.
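
For readers who have not met locally abstract types yet, here is a small sketch of what they buy you with a GADT. The functions length and contents are hypothetical names made up for this illustration; only the data type comes from the suggestion above.

```ocaml
type _ data = Bytes : bytes -> bytes data | String : string -> string data

(* (type a) introduces a fresh abstract type for the body of the function.
   Each branch may then learn what a really is: bytes in the first branch,
   string in the second. An ordinary type variable 'a would instead be
   unified with bytes by the first branch and rejected by the second. *)
let length (type a) (d : a data) : int =
  match d with
  | Bytes b -> Bytes.length b
  | String s -> String.length s

(* The return type can depend on the index as well. *)
let contents (type a) (d : a data) : a =
  match d with
  | Bytes b -> b
  | String s -> s
```

Here contents (String "hi") has type string while contents (Bytes b) has type bytes, even though both go through the same function.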

For what it’s worth, it turns out that GADTs were counter-productive for reaching my ultimate goal. I could never “curry” with them: I kept hitting yet another “a b is not included in 'a b” kind of error where signature I is used. A plain private variant turned out to work with my cleaned-up structure. The relevant bits:

module Buf = struct
  (* types bigstring_buf, buffer_buf, bytes_buf, string_buf *)
  type t =
    | Bigstring_buf of bigstring_buf
    | Buffer_buf of buffer_buf
    | Bytes_buf of bytes_buf
    | String_buf of string_buf

  module type S = sig
    val length : unit -> int
  end

  module Make_bytes (I : sig
    val b : bytes_buf
  end) : S = struct
    let length () = Bytes.length I.b.data
  end
  (* Make_bigstring(), Make_buffer(), Make_string() ... *)

  let get_module bt =
    match bt with
    | Bytes_buf b -> (module Make_bytes (struct let b = b end) : S)
    (* ... *)

end

module Some_codec = struct
  let decode src dst =
    let module In = (val Buf.get_module src) in
    let module Out = (val Buf.get_module dst) in
    (* Use In and Out ... *)
    dst

  (* src is the string buffer, dst is a fresh 4096-byte output buffer *)
  let test str = decode (Buf.of_string str) (Buf.new_buffer 4096)
  (* val test : string -> Buf.t *)
end

The end result is that not only can Some_codec abstract away the input and output buffer implementations, but the variant is also matched only once per buffer, in the initial calls to Buf.get_module, rather than at every I/O call.
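
For anyone who wants to try this out, here is a minimal self-contained sketch of the same structure, reduced to two buffer kinds. The record layout, the get function in S, and Some_codec.count are assumptions made up for this sketch, since the snippet above elides the real definitions.

```ocaml
module Buf = struct
  (* Assumed layout: the actual bytes_buf and string_buf are not shown above. *)
  type 'a buf = { data : 'a; mutable pos : int }
  type bytes_buf = bytes buf
  type string_buf = string buf

  type t =
    | Bytes_buf of bytes_buf
    | String_buf of string_buf

  let of_bytes b = Bytes_buf { data = b; pos = 0 }
  let of_string s = String_buf { data = s; pos = 0 }

  module type S = sig
    val length : unit -> int
    val get : int -> char
  end

  module Make_bytes (I : sig
    val b : bytes_buf
  end) : S = struct
    let length () = Bytes.length I.b.data
    let get i = Bytes.get I.b.data i
  end

  module Make_string (I : sig
    val b : string_buf
  end) : S = struct
    let length () = String.length I.b.data
    let get i = String.get I.b.data i
  end

  (* The only place where the variant is inspected. Each constructor already
     carries a value of the type its functor expects, so no GADT is needed. *)
  let get_module bt =
    match bt with
    | Bytes_buf b -> (module Make_bytes (struct let b = b end) : S)
    | String_buf b -> (module Make_string (struct let b = b end) : S)
end

module Some_codec = struct
  (* Counts occurrences of c in any supported buffer. *)
  let count c src =
    let module In = (val Buf.get_module src : Buf.S) in
    let n = ref 0 in
    for i = 0 to In.length () - 1 do
      if In.get i = c then incr n
    done;
    !n
end
```

The loop in count calls In.get on every iteration without looking at the Buf.t constructor again; the calls still go through the functions of the packed module, so this is a sketch of the dispatch-once idea rather than a claim about raw performance.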
