Reducing duplicated code in recursive module definitions

Hej!

I am working on a library where I need to use recursive modules. Each of these modules defines a type t (possibly referencing other modules’ t’s). Since the type checker needs to have an explicit signature for each recursive module, I currently end up with a lot of duplicated code. For example in

module rec A : sig 
  type t = B.t
end = struct 
  type t = B.t 
end

and B : sig 
  type t = int
end = struct 
  type t = int
end 

I end up having to define the type explicitly in the signature and the struct. The above example is of course too simple, as for that I would not actually need recursive modules, it should just show a minimal example. Is there a way of importing the type definition in struct from the signature of the same module?
I have tried playing around with ppx_import but did not get very far.

Thanks for all inputs :slight_smile:

1 Like

If you can construct your modules such that they only contain type definitions, then the signature is all you need, and the compiler will infer the “implementation”:

module rec A : sig 
  type t = B.t
end = A
and B : sig 
  type t = int
end = B

I think the “canonical” post about this (at least, the spot I originally learned it from) is Jane Street Tech Blog - A trick: recursive modules from recursive signatures

1 Like

Thanks for the link!

That is indeed interesting, unfortunately my modules do contain functions as well.

IME, it’s pretty rare to have to define the functions that accompany some set of recursive types within the same module – judicious use of includes can almost always yield a final module (or three, whatever) that have the appropriate signature(s), without duplicative declarations.

It’s actually very common, and one example is mentioned in the Jane Street blog you linked to: when you want automatic deriving of functions.
The other common case I’m aware of is when you need functors (for instance a tree defined as a set of trees).
In fact most of the time, if you need recursive modules it’s because you don’t have only types; otherwise a recursive type definition would have worked as well.

My own suggestion to solve the problem of type duplication would be to use a C-like preprocessor, such as cppo.

2 Likes

Thanks @vlaviron for suggesting cppo. I think that might actually solve my problem.
I have an additional question now though, which is more for curiosity. If I extend the above example a bit:

module rec A : sig 
  type t = B.t
  val eval : B.t -> int 
end = struct 
  type t = B.t 
  let eval b = b.x 
end

and B : sig 
  type t = {
    x : int; 
    y: int; 
  }
end = struct 
  type t = {
    x : int; 
    y : int; 
  }
end 

it does not type check, saying that the record field x in b.x is unbound. If I explicitly mention the type like in

module rec A : sig 
  type t = B.t
  val eval : B.t -> int 
end = struct 
  type t = B.t 
  let eval (b : B.t) = b.x 
end

and B : sig 
  type t = {
    x : int; 
    y: int; 
  }
end = struct 
  type t = {
    x : int; 
    y : int; 
  }
end 

it works without a problem. However, I don’t fully understand why that is necessary, given that the signature of the eval function inside module A already specifies that the first argument must by of type B.t. I’m a bit curious why the type checker cannot figure this out without manual annotation.

I might have actually found the answer myself. If I use b.B.x instead it works.

When we write (M : S), I’m not sure that OCaml propagates type annotations from S to type-check M. Doing this would be a bit weird in the general case where several toplevel declarations of M have the same name; in this case only the last one would benefit from the annotation. Seemingly innocuous transformations (add let foo = foo after the last declaration of foo) would then break typing.