Yes, it is. I think this is a (poor) name choice that confuses you. You assume that `Foo_intf` actually has something to do with the `Foo` module, hence the name. In fact, the idea is that you define your abstractions in the module `Foo_intf`, and then your `Foo` and other modules depend on the abstractions rather than on implementations such as `Foo`. I myself rarely, if ever, use the `foo_intf.ml` naming scheme. Usually, I tell myself that if I can’t give a name to an abstraction, then it is probably a bad abstraction to start with. I usually define some number of module types in a file called `library_types.ml`, e.g., `compiler_types.ml`, and then refer to those abstractions where necessary. Note also that any module type in OCaml acts as a generator for a family of module types, e.g., if you have a module type
```ocaml
module type S = sig
  type t
  val init : t
  val succ : t -> t
end
```
then you can use it to create module types for concrete types, e.g., `S with type t = int`, `S with type t = expr`, etc. Therefore, your module types in `compiler_types.ml` should be as free from constraints as possible/reasonable.
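As a small self-contained sketch (the `Counter` module is hypothetical, just for illustration), a sharing constraint picks one member out of the family generated by `S`, and a concrete module can then be sealed with it:

```ocaml
module type S = sig
  type t
  val init : t
  val succ : t -> t
end

(* [S with type t = int] is the member of the family generated by [S]
   in which [t] is made concrete *)
module Counter : S with type t = int = struct
  type t = int
  let init = 0
  let succ = succ   (* Stdlib.succ on ints *)
end

let () = assert (Counter.succ Counter.init = 1)
```

Because the constraint exposes `t = int`, clients can still use `Counter.t` values as ordinary integers; sealing with plain `S` instead would make `t` fully abstract.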
Concerning your style (I looked into `act`), it looks like it is heavily influenced by Haskell, where you define type classes, make your implementation depend on those type classes, and then instantiate a solution with a concrete selection of implementation types.
Before applying the same approach in OCaml, you should consider a couple of differences between the two languages and correct your approach accordingly.
- OCaml provides a stronger and more powerful module system than Haskell
- OCaml functors are more expressive, but less mechanized, than Haskell type classes, so they usually impose a higher cognitive burden

The first point is that where in Haskell you have only type classes to protect your abstractions, in OCaml you have module types, with sharing constraints and strengthening. Modules with abstract (opaque) types provide sufficient protection, so in most cases it is fine to depend on a concrete module rather than on an abstraction that this module implements. E.g., consider the following two approaches:
```ocaml
module type Var = sig (* ... *) end
module type Exp = sig (* ... *) end

module type S = sig
  type exp
  val run : exp -> exp
end

module Optimizer (V : Var) (E : Exp with type var = V.t) :
  S with type exp = E.t
```
which basically mimics the Haskell style, where you have two type classes (`Var` and `Exp`) and a generic `run` function defined in the context of those two type classes. And finally, you have a particular instantiation of your framework with concrete instances of the type classes:
```ocaml
module Exp = Non_hashconsed_exp(String)
module Optimizer = Optimizer(String)(Exp)

let main input =
  Exp.deserialize input |>
  Optimizer.run |>
  Exp.serialize
```
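To make the scheme concrete, here is a minimal, runnable sketch of the same functorized style; all the signatures, the `SExp` representation, and the toy zero-elimination pass are hypothetical, invented only for illustration:

```ocaml
module type Var = sig
  type t
end

module type Exp = sig
  type var
  type t
  val zero : t
  val var : var -> t
  val add : t -> t -> t
  val analyze :
    t -> zero:'a -> var:(var -> 'a) -> add:(t -> t -> 'a) -> 'a
end

module type S = sig
  type exp
  val run : exp -> exp
end

module Optimizer (V : Var) (E : Exp with type var = V.t)
  : S with type exp = E.t = struct
  type exp = E.t
  (* rewrites [0 + x] to [x] after optimizing the subterms *)
  let rec run e =
    E.analyze e
      ~zero:E.zero
      ~var:E.var
      ~add:(fun x y ->
          let x = run x and y = run y in
          E.analyze x
            ~zero:y
            ~var:(fun _ -> E.add x y)
            ~add:(fun _ _ -> E.add x y))
end

(* one concrete instantiation of the framework *)
module SVar = struct type t = string end

module SExp = struct
  type var = string
  type t = Zero | Var of var | Add of t * t
  let zero = Zero
  let var v = Var v
  let add x y = Add (x, y)
  let analyze e ~zero ~var ~add = match e with
    | Zero -> zero
    | Var v -> var v
    | Add (x, y) -> add x y
end

module O = Optimizer (SVar) (SExp)

let () = assert (O.run SExp.(add zero (var "x")) = SExp.Var "x")
```

Note how `O.run` depends only on the `Exp` theory (the `analyze` fold), never on the constructors of `SExp.t`.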
This is a perfectly fine solution, where you try to be as generic as possible so that your code becomes robust to future changes. And I’m not advising against this style, except that the more abstractions you introduce, the more indirections you have and the more delayed choices you make, the higher the cognitive burden of your framework, which at the end of the day affects its maintainability, testability, and usability. So you have to apply Occam’s razor: use the lightest method when you build your system and call for the heavy artillery only if and when needed.

Going back to our example, it is perfectly fine to implement the `Optimizer` module referring directly to the `Var` and `Exp` modules, especially since we don’t have (and probably do not plan to have in the near future) many different implementations of those. Keep in mind, though, that when you write a function `val optimize : Exp.t -> Exp.t` you’re actually introducing a dependency on the abstract type `Exp.t`, not on a concrete implementation, so you are protected from the technical debt of poor choices made in the `exp.ml` implementation by the `exp.mli` interface. Therefore, you should design the `exp.mli` interface very carefully: basically, you should try to find the strongest possible theory that is sufficient to implement the `optimize` function without leaking any details. Then, if you later decide to try another representation, you can generalize your `Optimizer` module, make it a functor, and go back to the functorized solution once it is really needed. You can even make it backward compatible, i.e., without breaking the interfaces. E.g., it starts as
```ocaml
(* file optimizer.mli *)
val run : Exp.t -> Exp.t
```
which is later generalized to
```ocaml
(* file optimizer_types.ml *)
module type Exp = sig ... end
module type Var = sig ... end

module type S = sig
  type exp
  val run : exp -> exp
end
```
and
```ocaml
(* file optimizer.mli *)
open Optimizer_types

module Make(E : Exp) : S with type exp = E.t

(* and the default implementation, using the concrete Exp.t *)
include S with type exp = Exp.t
```
where the optimizer is usually generalized by just wrapping the old function in `module Make(Exp : Exp) = struct ... end`, e.g.,
```ocaml
(* file optimizer.ml *)
open Optimizer_types

module Make(Exp : Exp) = struct
  let rec run input =
    Exp.analyze input
      ~case_add:(fun x y -> ...)
      ...
end

include Make(Exp)
```
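Coming back to the point about abstract types: here is a minimal sketch of how such a seal protects clients. The module and its operations are hypothetical; the signature plays the role of a carefully designed `exp.mli`:

```ocaml
(* the signature plays the role of a hypothetical exp.mli:
   the representation of [t] stays hidden from all clients *)
module Exp : sig
  type t
  val var : string -> t
  val add : t -> t -> t
  val to_string : t -> string
end = struct
  (* the representation can later change (e.g., to a hash-consed
     one) without breaking any client code *)
  type t = Var of string | Add of t * t
  let var x = Var x
  let add x y = Add (x, y)
  let rec to_string = function
    | Var x -> x
    | Add (x, y) -> "(" ^ to_string x ^ " + " ^ to_string y ^ ")"
end

let () =
  let e = Exp.add (Exp.var "x") (Exp.var "y") in
  assert (Exp.to_string e = "(x + y)")
```

A client such as the optimizer can only build and consume `Exp.t` through this theory, which is exactly the protection an `.mli` gives you without any functors.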
To summarize, do not be afraid to depend on modules, as long as your modules have sufficient mli files. A good indicator that you’re using a functor where you could just depend on a module is when you have lots of sharing constraints referring to concrete types in your mli files.
One final note: do not be afraid to duplicate signatures, as duplicating signatures (even via copy-pasting) is very different from code duplication. The main reason code duplication is perceived as a bad practice is that it duplicates errors: once you fix an error in one place, it still persists in the place where it was duplicated. However, when you duplicate a signature, it is not code, since it doesn’t have any runtime semantics. In other words, it can’t go wrong. Moreover, whenever you update your signatures, the compiler will automatically verify that all their duplicates are still consistent, so that you can fix/update them.
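For instance (a deliberately tiny, hypothetical example), the same signature written twice stays in sync because the compiler checks the implementation against every copy:

```ocaml
(* the "internal" copy of the signature, e.g. in a *_types.ml *)
module type S = sig
  val answer : int
end

(* a duplicated copy, e.g. inlined in the public mli *)
module type S_public = sig
  val answer : int
end

module M : S = struct
  let answer = 42
end

(* the compiler verifies that [M] still satisfies the duplicate;
   if the two copies diverge, this line stops compiling *)
module M_public : S_public = M

let () = assert (M_public.answer = 42)
```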
Some may say that duplication of interfaces duplicates the amount of reasoning about the code, since the reader might now need to read the same types twice. That is correct to a certain degree. However, indirection also increases the cognitive burden, and you know this from reading Jane Street’s interfaces, where you will find lots of annoying `include Foo_intf.S`, where `foo_intf.ml` itself includes other interfaces, and so on, until you lose track of what you were looking for. So probably having all the interfaces right here, at hand, inlined is better.
I myself usually apply this approach to two or more sets of duplicated interfaces. For example, when I define a library, I have a set of modules, each having a so-called internal interface, and an umbrella module, which publishes a subset of those modules and an interface which is itself a subset of their union. This interface I call the public interface. Here is a concrete example, which in fact involves lots of module types and functors. (Note it is a work in progress, so it lacks documentation.) For more finished projects, consider the Bap Primus framework or the Monads Transformer Library. All those projects involve a substantial amount of signature duplication, e.g., all interfaces in `monads_types.ml` are repeated in `monads.mli`.