I ran into the following example, and I was surprised to find that f' is accepted while f is not. Is this expected behavior? I’m guessing it’s related to module type inference on (module B)—but I thought that module types are structural, so it should be fine to pass a B to a function that expects an A.
module type A = sig
val a : int
end
module type B = sig
include A
val b : int
end
let g (module A : A) = A.a
let f ((module B : B) as m) = g m
let f' (module B : B) = g (module B)
Not really, it’s a problem of subtyping. In f the value m belongs to the core language with type (module B), but there is no built-in subtyping relation in the core language: g expects a value of type (module A) that is not a supertype of (module B). Contrarily, in f' the value B belongs to the module language and the module type B is a subtype of A, so the built-in subtyping relation of the module language applies.
You could also have defined f this way:
let f (m : (module B)) = g m (* don't type check since g : (module A) -> int *)
and if you want it to type check, you should write:
As a bit of an aside, there’s a good reason for not providing structural subtyping of first-class modules: their run-time representation (effectively a record of values) depends on the order in which the values appear in the module type. For example:
(* test_order.ml *)
module type A_then_B = sig val a : int val b : int end
module type B_then_A = sig val b : int val a : int end
module X = struct let a = 1000 let b = 2000 end
let x = (module X : A_then_B)
let x' = (module X : B_then_A)
Our Test_order module gets compiled down to a block with three fields: T, x and x'. We see that x is represented as exactly T (with no computation at runtime), but evaluating x' requires making a new heap block in which the fields have been swapped. (Something similar happens when running under ocamlopt, but with a bit more noise.)
In your example, the runtime implementation of f' depends on the order of fields in B:
-f' = (function m : int (apply g (makeblock 0 (field 0 m))))
+f'_reversed = (function m : int (apply g (makeblock 0 (field 1 m)))))
Thus, f' can’t in general be implemented as exactly g (and the inference to check whether it can would require some notion of row polymorphism, and not just plain structural subtyping). The type-checker thus requires going via the module language (as @kantian describes), with the corresponding run-time operations that this implies.
I’m not sure that row polymorphism is necessary for this, they could have transfer the structural subtyping of modules to first-class modules, following this rule: (module A) <: (module B) iff A <: B. But It’s more code to write and maintain in the compiler, and I don’t think it worth it: it’s easy for a user to retrieve this subtyping relation by unpacking the first class module.
@emillon : this is not the unpacking that did something at runtime, but the coercion on module. It’s the same thing with normal functors when there is a module type coercion.
module type A = sig val a : int end
module type B = sig include A val b : int end
module A : A = struct let a = 1 end
module B : B = struct let a = 1 let b = 2 end
(* the definition of C implies a block creation at runtime *)
module C = (B : A)
Basically, modules and records have exactly the same runtime representation (that’s why there is no computation at runtime when defining x in @CraigFe code). And the OP example is just similar to this one:
type a = { a : int }
type b = { a : int; b : int }
let g {a} = a
(* the subtyping coercion function *)
let b_to_a {a ; b } = { a }
(* doesn't type check *)
let f ({a; b} as m) = g m
(* type check *)
let f' ({a ; b} as m) = g (b_to_a m)
The difference is that with first-class module the coercion function is implicitly applied.
My point here was about using the type system to track whether a particular first-class module coercion could have no run-time cost given the runtime representation (i.e. whether f' = g), and this is order-dependent. Naturally if one allows the subtyping coercion <: to perform some transformation at run-time then it’s possible to provide structural subtyping here (or any number of fancy subtyping relations ).
To me at least, such a feature would be jarring: I’d rather the type-system not paper over potentially-expensive runtime conversions in the core language. (I think coercions in the module language get a pass here because they’re typically static or happen once at initialisation time.) I’d be interested to know if any of the existing subtyping relations in OCaml have a runtime implementation.
I think that’s what we have with first-class module: coercion in the core language that implies some runtime conversions (because that’s what we have with the module language).
For sure we can’t write this:
let a_modB_is_a_modA m = (m : (module B) :> (module A));;
Error: Type (module B) is not a subtype of (module A)
because there is no such built-in subtyping relation, but morally this is exactly what is stating this function:
let a_modB_is_a_modA (module B : B) = (module B : A);;
val a_modB_is_a_modA : (module B) -> (module A) = <fun>
And in most (if not all) use cases of first-class functor, such coercion with runtime conversion is what’s happen, because that’s what will do these functors:
(* functor versions of the OP examples *)
module G (A : A) = struct let res = A.a end
(* the runtime creates a block with the shapes of module type A
in order to apply the functor G to the input B *)
module F (B : B) = G (B)
I personnally think that the fact that :> is a type-level only operator without any effect on the runtime values is a meaningful property. In particular, without this property, you break separate compilation since you cannot implement coercion for an abstract type constructor with a covariant or contravariant type parameter (because x: 'a t :> 'b t now requires to know the definition of t) .
Good point, I hadn’t thought to this issue and it’s a very good reason to keep :> a type-level operator and a no-op at runtime. I think I get @CraigFe point now.
And, if we really want such coercion, it’s not really more verbose to write:
let conv m = (m : (module B) :> (module A))
(* compare to *)
let conv m = (module (val m : B) : A)
In our actual world, this coercion function is usually explicitly exported and named map for covariant functor. But, I guess it’s better to let the user choose how to apply it for efficiency reason. For instance, with the OP example and the list functor, we could want to do:
let foo (l : (module B) list) = List.map g (l :> (module A) list)
which can be correctly (i.e it type checks) translated to:
let coerce (m : (module B)) = (module (val m) : A)
let foo (l : (module B) list) = List.map (fun m -> g (coerce m)) l
(* or this way if we coerce the list first, but it's less efficient *)
let foo (l : (module B) list) = l |> List.map coerce |> List.map g