Confused about aliases of first class module arguments

I ran into the following example, and I was surprised to find that f' is accepted while f is not. Is this expected behavior? I’m guessing it’s related to module type inference on (module B)—but I thought that module types are structural, so it should be fine to pass a B to a function that expects an A.

module type A = sig
  val a : int
end

module type B = sig
  include A

  val b : int
end

let g (module A : A) = A.a

let f ((module B : B) as m) = g m

let f' (module B : B) = g (module B)
2 Likes

Not really, it’s a problem of subtyping. In f the value m belongs to the core language with type (module B), but there is no built-in subtyping relation in the core language: g expects a value of type (module A) that is not a supertype of (module B). Contrarily, in f' the value B belongs to the module language and the module type B is a subtype of A, so the built-in subtyping relation of the module language applies.

You could also have defined f this way:

let f (m : (module B)) = g m (* don't type check since g : (module A) -> int *)

and if you want it to type check, you should write:

let f (m : (module B)) = g (module (val m))
4 Likes

As a bit of an aside, there’s a good reason for not providing structural subtyping of first-class modules: their run-time representation (effectively a record of values) depends on the order in which the values appear in the module type. For example:

(* test_order.ml *)
module type A_then_B = sig val a : int val b : int end
module type B_then_A = sig val b : int val a : int end
module X = struct let a = 1000 let b = 2000 end

let x  = (module X : A_then_B)
let x' = (module X : B_then_A)
$ ocamlc -dlambda test_order.ml
(setglobal Test_order!
  (let
    (T/88 =
       (module-defn(T/88) Test_order test_order.ml(3):110-157
         (let (a/86 =[int] 1000 b/87 =[int] 2000) (makeblock 0 a/86 b/87)))
     x'/94 = (makeblock 0 (field 1 T/88) (field 0 T/88)))
    (makeblock 0 T/88 T/88 x'/94)))

Our Test_order module gets compiled down to a block with three fields: T, x and x'. We see that x is represented as exactly T (with no computation at runtime), but evaluating x' requires making a new heap block in which the fields have been swapped. (Something similar happens when running under ocamlopt, but with a bit more noise.)

In your example, the runtime implementation of f' depends on the order of fields in B:

-f'          = (function m : int (apply g (makeblock 0 (field 0 m))))
+f'_reversed = (function m : int (apply g (makeblock 0 (field 1 m)))))

Thus, f' can’t in general be implemented as exactly g (and the inference to check whether it can would require some notion of row polymorphism, and not just plain structural subtyping). The type-checker thus requires going via the module language (as @kantian describes), with the corresponding run-time operations that this implies.

4 Likes

I had no idea that first class module unpacking did something at runtime. Thanks a lot @CraigFe !

2 Likes

I’m not sure that row polymorphism is necessary for this, they could have transfer the structural subtyping of modules to first-class modules, following this rule: (module A) <: (module B) iff A <: B. But It’s more code to write and maintain in the compiler, and I don’t think it worth it: it’s easy for a user to retrieve this subtyping relation by unpacking the first class module.

@emillon : this is not the unpacking that did something at runtime, but the coercion on module. It’s the same thing with normal functors when there is a module type coercion.

module type A = sig val a : int end
module type B = sig include A val b : int end

module A : A = struct let a = 1 end
module B : B = struct let a = 1 let b = 2 end
(* the definition of C implies a block creation at runtime *)
module C = (B : A)

Basically, modules and records have exactly the same runtime representation (that’s why there is no computation at runtime when defining x in @CraigFe code). And the OP example is just similar to this one:

type a = { a : int }
type b = { a : int; b : int }

let g {a} = a

(* the subtyping coercion function *)
let b_to_a {a ; b } = { a }

(* doesn't type check *)
let f ({a; b} as m) = g m

(* type check *)
let f' ({a ; b} as m) = g (b_to_a m)

The difference is that with first-class module the coercion function is implicitly applied.

My point here was about using the type system to track whether a particular first-class module coercion could have no run-time cost given the runtime representation (i.e. whether f' = g), and this is order-dependent. Naturally if one allows the subtyping coercion <: to perform some transformation at run-time then it’s possible to provide structural subtyping here (or any number of fancy subtyping relations :slight_smile:).

To me at least, such a feature would be jarring: I’d rather the type-system not paper over potentially-expensive runtime conversions in the core language. (I think coercions in the module language get a pass here because they’re typically static or happen once at initialisation time.) I’d be interested to know if any of the existing subtyping relations in OCaml have a runtime implementation.

I’m not sure I get your point. When you say:

I think that’s what we have with first-class module: coercion in the core language that implies some runtime conversions (because that’s what we have with the module language).

For sure we can’t write this:

let a_modB_is_a_modA m = (m : (module B) :> (module A));;
Error: Type (module B) is not a subtype of (module A)

because there is no such built-in subtyping relation, but morally this is exactly what is stating this function:

let a_modB_is_a_modA (module B : B) = (module B : A);;
val a_modB_is_a_modA : (module B) -> (module A) = <fun>

And in most (if not all) use cases of first-class functor, such coercion with runtime conversion is what’s happen, because that’s what will do these functors:

(* functor versions of the OP examples *)
module G (A : A) = struct let res = A.a end

(* the runtime creates a block with the shapes of module type A
   in order to apply the functor G to the input B *)
module F (B : B) = G (B)

I personnally think that the fact that :> is a type-level only operator without any effect on the runtime values is a meaningful property. In particular, without this property, you break separate compilation since you cannot implement coercion for an abstract type constructor with a covariant or contravariant type parameter (because x: 'a t :> 'b t now requires to know the definition of t) .

4 Likes

Good point, I hadn’t thought to this issue and it’s a very good reason to keep :> a type-level operator and a no-op at runtime. I think I get @CraigFe point now.
And, if we really want such coercion, it’s not really more verbose to write:

let conv m = (m : (module B) :> (module A))

(* compare to *)
let conv m = (module (val m : B) : A)
2 Likes

Well in that alternative world I would expect the module defining t to implicitly export a coercion function.

In our actual world, this coercion function is usually explicitly exported and named map for covariant functor. But, I guess it’s better to let the user choose how to apply it for efficiency reason. For instance, with the OP example and the list functor, we could want to do:

let foo (l : (module B) list) = List.map g (l :> (module A) list)

which can be correctly (i.e it type checks) translated to:

let coerce (m : (module B)) = (module (val m) : A)

let foo (l : (module B) list) = List.map (fun m -> g (coerce m)) l

(* or this way if we coerce the list first, but it's less efficient *)
let foo (l : (module B) list) = l |> List.map coerce |> List.map g