Type inference and recursive modules

A had a question about mutually recursive modules. It is similar to the example found here.

Check out these two modules. If have intentionally added bug in the to_b function where it uses the A.of_int rather than B.of_int.

module rec A : sig
  type t
  val of_int : int -> t
  val to_b : t -> B.t
end = struct
  type t = int
  let of_int x = x

  (* With the B.t type annotation, this won't compile.  If you remove it, it will compile. *)
  let to_b x : B.t = A.of_int x
end

and B : sig
  type t
  val of_int : int -> t
  val to_a : t -> A.t
end = struct
  type t = int
  let of_int x = x
  let to_a x : A.t = A.of_int x
end

Since t is abstract for both modules, I expected this let to_b x = A.of_int x to give an error since A.of_int returns A.t and not B.t. But the compile error only occurs when I define it with type annotation: let to_b x : B.t = A.of_int x.

I assume that the reason it compiles fine is the fact that modules A and B are defined recursively and since they have the same type t = int in both implementations, it just works out.

In this case, it doesn’t really matter if you were to remove the type annotation and let it compile, but you could imagine a case where the A.of_int and B.of_int had some special preconditions that it enforced…e.g., ensuring A types are all positive numbers and B types are all negative numbers. Depending on the specific logic, you could get a runtime error down the line somewhere.

Basically, I’m wondering if this is the expected behavior. From the many Discuss posts about recursive modules, they seem to be a bit tricky, and I suppose it may be best to avoid them if possible…

This is a manifestation that the typing of definition and the verification of signature constraints are done in successive phase in OCaml.

A minimal example would be:

module rec A: sig type t end = struct
  type t = int
end
and B : sig val x: A.t end = struct
  let x = 0
end

In this example, first we check the type of the definition of A and B, assuming that A:sig type t end and B: sig val x: A.t end. At this stage, we infer that

A: sig type t = int end
B: sig val x: int end

At this point, we check that those inferred types are a subtype of the signature constraints. This is visibly true for A since sig type t = int is a subtype of sig type t end. Maybe more surprisingly, this is still true for B , sig val x: int end is a subtype of sig val x: A.t under the hypothesis that A: sig type t = int end.

When you add the constraint x : A.t in the implementation of B

module rec A: sig type t end = struct
  type t = int
end
and B : sig val x: A.t end = struct
  let x : A.t = 0
end

it is the first step of typechecking that break: when we are inferring the type for B, we only have the hypothesis A:sig type t end (and B: sig val x: A.t end). Thus the line

let x : A.t = 0

cannot be deduced from our hypotheses, yielding a type error.

In other words, this means that mutually-recursive modules can use some type equality from their sibling modules that are not exposed by the external interface, if one are careful enough to not use those equality during type inference.

3 Likes

Thanks for the clear explanation and the examples. This cleared it up for me!!

@octachron is this feature a goal of recursive modules or just a side effect of the implementation which we will need to live with forever?

The difference of behavior between the first and second version is probably due to a lack of specification for recursive modules (outside of the common core).

However, accepting the first definition seems coherent. And I guess that a more principled specification and implementation of recursive model would strive to allow more definitions rather than the opposite.