Inferring (module) type of codomain of functor (like "ocamlc -i")

I have a little problem with typing of OCaml modules, and I fear that the answer is
“can’t get there from here”. But I’m not sure, so I figured I’d ask and see if
there’s something I’m missing.

In the following, each of the modules L, PA1, PA2 has many types and
values/entrypoints. Many, many. So maintaining the module-types of
these modules would be … tiresome. The reason I want to convert
these modules into functors is that their implementation is partially
imperative, with references among the values.

The problem: Suppose I have three modules, L, PA1, and PA2. They’re written
as plain old files, and have various types and values. Each depends on the
previous ones in the sequence. I want to convert these into functors, so
that I can have multiple instances of each. So I’d have a functor PA1
that took an instantiated module L as argument, and then a functor PA2 that
took instantiated L and PA1 as arguments. Something like the following.

In the case where these weren’t functors, I didn’t need to write out MLI files.
But now that I have functors, I need to write out module-types for L(), PA1(),
PA2(). I’ve done so below, but it seems to me that I ought to be able to
infer those types (PA1TY and PA2TY, which appear as PATY and PBTY in the code below).

It feels like there ought to be some way to use “module type of” and
destructive module substitution to do the trick, but it isn’t obvious to me.

Thank you for any advice.

module type LTY = sig
  type t
  val f : t -> t
end

module L() : LTY = struct
  type t = A | B
  let f x = x
end

module PA (L : LTY) = struct
  module L = L
  let f = L.f
end

module type PATY = sig
  module L : LTY
  val f : L.t -> L.t
end

module type PBTY = sig
  module L: LTY
  module PA : PATY with module L = L
  val f : L.t -> L.t
end

module PB (L : LTY) (PA : PATY with module L = L)
  : (PBTY with module L = L and module PA = PA) = struct
  module L = L
  module PA = PA
  let f = PA.f
end

(* using these functors *)
module L1 = L()
module L2 = L()

module PA1 = PA(L1)
module PA2 = PA(L2)

module PB1 = PB(L1)(PA1)
module PB2 = PB(L2)(PA2)

In this case it would help to know more about the expected usage patterns for the generative functors.

If it’s always exactly this pattern you could have PA, PB, and L all be generative functors (i.e., with the () argument), each instantiating the next one in the chain. This eliminates the need to define module types for any functor arguments.

module L () = struct … end
module PA () = struct
  module L = L ()
  …
end
module PB () = struct
  module PA = PA ()
  module L = PA.L
  …
end

You should then be able to construct PB instances and project components out as needed:

module PB1 = PB ()
module PB2 = PB ()
module PA1 = PB1.PA
module PA2 = PB2.PA
module L1 = PB1.L (* or [PA1.L] *)
module L2 = PB2.L (* or [PA2.L] *)
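
To make the pattern above concrete, here is a minimal runnable sketch. The bodies are hypothetical stand-ins for the elided parts: a counter plays the role of the mutable grammar state, and the names counter, bump, bump_twice and current are made up for illustration. It only aims to show that each PB () gets its own private L and PA.

```ocaml
(* Hypothetical stand-in for L: one mutable counter per instance. *)
module L () = struct
  let counter = ref 0
  let bump () =
    incr counter;
    !counter
end

(* PA instantiates its own L; no module type is written anywhere. *)
module PA () = struct
  module L = L ()
  let bump_twice () =
    ignore (L.bump ());
    L.bump ()
end

(* PB instantiates PA and re-exports the L that PA created. *)
module PB () = struct
  module PA = PA ()
  module L = PA.L
  let current () = !(L.counter)
end

module PB1 = PB ()
module PB2 = PB ()

let () =
  ignore (PB1.PA.bump_twice ());
  (* PB1's state has advanced; PB2's state is untouched. *)
  assert (PB1.current () = 2);
  assert (PB2.current () = 0)
```

The trade-off is that an L cannot be shared between two different PA instances, since every PA () creates a fresh one.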

Thank you for your response. Some more info.

I’ll start with the original code. This is all for an extensible grammar system (Camlp5). So:

  • there is the definition of the base of the grammar – its mutable state: call that L. You could call that a grammar-interpreter.
  • and then there are multiple different syntaxes that can be loaded
  • So one might be for the original syntax: this is split into two files, “pa_o” and “pa_op”
  • or the revised syntax: this again is split into two files, “pa_r” and “pa_rp”
  • It is relevant that after loading whichever syntax you want, you can (and people do) load further additional syntax rules.

So if you load “l.cmo”, “pa_o.cmo”, “pa_op.cmo”, that’s one syntax. Then you might load something further.

Or you might load “l.cmo”, “pa_r.cmo”, “pa_rp.cmo”.

In either case, there are other extensions that you can load, e.g. for writing the grammars themselves, for extensible functions, for extensible printers. All of these define syntax extensions.

Now what I want to do is to functorize everything, so I can have multiple grammar-interpreters instantiated simultaneously, each with a different syntax loaded. When we compile these files in the non-functorized setup, ocamlc infers a signature for each file, and subsequent files can consult that signature and use entry points from the previous file.

But to do that when we functorize everything, we need to declare the signature of functor pa_r’s codomain, so we can use it to constrain the argument to functor pa_rp.

I can hack together something that instantiates the functor, compiles it with “-i”, hacks on the output, and rebuilds the signature, but … that would be ugly, and I’m hoping that there’s some elegant way to do it. I mean, if this weren’t functorized, it would be … zero effort, since it’s just “the way OCaml works when there are no MLI files”.

Would this work for you?

module L () = struct
  type t = A | B
  let f x = x
end

module PA (L : module type of L ()) = struct
  type t = L.t
  let f = L.f
end

module PB (L : module type of L ()) (PA : module type of PA (L)) = struct
  let f = PA.f
end

(* using *)
module L1 = L ()
module L2 = L ()
module PA1 = PA (L1)
module PA2 = PA (L2)
module PB1 = PB (L1) (PA1)
module PB2 = PB (L2) (PA2)

(* module PB_error = PB (L1) (PA2) *)
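
As a rough check that the same trick carries over to modules with mutable state, here is a small sketch in the same shape. Everything in it is a hypothetical stand-in for the real grammar code: the rule type, the rules table, and the extend and names functions are invented for illustration. It assumes PA re-exports a type from L, because that type equation is what ties a PA instance to its L.

```ocaml
(* Hypothetical stand-in for the grammar interpreter: one rule table
   per instance. *)
module L () = struct
  type rule = Rule of string
  let rules : rule list ref = ref []
  let extend r = rules := r :: !rules
end

(* The argument signature is inferred from L's body, not written out. *)
module PA (L : module type of L ()) = struct
  type rule = L.rule           (* ties this PA to its particular L *)
  let extend = L.extend
  let () = extend (L.Rule "pa")
end

(* Likewise, PA's result signature is inferred by applying PA to L. *)
module PB (L : module type of L ()) (PA : module type of PA (L)) = struct
  let () = PA.extend (L.Rule "pb")
  (* Rule names in the order they were loaded. *)
  let names () = List.rev_map (fun (L.Rule s) -> s) !(L.rules)
end

module L1 = L ()
module L2 = L ()
module PA1 = PA (L1)
module PA2 = PA (L2)
module PB1 = PB (L1) (PA1)

let () =
  (* L1 received the rules of both PA1 and PB1; L2 only those of PA2. *)
  assert (PB1.names () = ["pa"; "pb"]);
  assert (List.length !(L2.rules) = 1)

(* Should be rejected by the type checker: PA2.rule is L2.rule,
   not L1.rule. *)
(* module PB_bad = PB (L1) (PA2) *)
```

If PA's signature mentioned no type coming from L, nothing would distinguish PA1 from PA2 at the type level, and the mixed application would likely be accepted.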

oh wow, sweeeet! Thank you! I had never thought of putting nontrivial module-exprs in “module type of”! Would never have, neither!