What is the reason for separating module implementations and signatures in OCaml?

In the ML module system, modules represent abstract data types with existential types, as shown in the foundational work by Mitchell and Plotkin. Compare this with conventional languages, such as Java or C++, where abstract data types are (poorly) modeled with classes (and interfaces) that bind the nominal abstraction together with a set of methods (operations). The ML module system does not invent any ad hoc constructs, such as classes, but relies on mathematics to deliver proper definitions that have stood the test of time. In the ML module system, structures denote mathematical objects, and their types are denoted by signatures.
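To make the structure/signature relation concrete, here is a minimal sketch. The names `STACK` and `ListStack` are made up for illustration; the point is only that the signature plays the role of the structure's type, and that `'a t` is abstract (existentially quantified) outside the structure.

```ocaml
(* A signature: the "type" of a structure. ['a t] is left abstract. *)
module type STACK = sig
  type 'a t
  val empty : 'a t
  val push : 'a -> 'a t -> 'a t
  val pop : 'a t -> ('a * 'a t) option
end

(* A structure of that type. Ascribing STACK hides the fact that
   ['a t] is a list, so clients can only use the listed operations. *)
module ListStack : STACK = struct
  type 'a t = 'a list
  let empty = []
  let push x s = x :: s
  let pop = function
    | [] -> None
    | x :: s -> Some (x, s)
end
```

Any other structure that matches `STACK` could be substituted for `ListStack` without touching client code.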

The separation between abstraction and implementation is an essential part of modular programming in particular and of reasoning in general. Properly chosen abstractions reduce the amount of information that we need to reason about and allow us to build complex systems from smaller parts. One of the responsibilities of a module system in a programming language is to protect abstractions by ensuring that modules depend on abstractions, not implementations. Consider Python, Common Lisp, and many other dynamically typed languages, which do not protect abstractions because they provide no mandatory information-hiding mechanism. As a project evolves, implementation details diffuse across module boundaries, which eventually leads to projects that are hard to maintain and hard to understand.
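As a small illustration of what "protecting the abstraction" means in practice, consider the following sketch. The module `Nat` is hypothetical; it keeps the invariant "never negative" by exposing an abstract type, so client code cannot forge a value or depend on the `int` representation.

```ocaml
(* Nat is a made-up example module. Its signature hides the
   representation, so the only way to get a Nat.t is via of_int. *)
module Nat : sig
  type t
  val of_int : int -> t option   (* None for negative input *)
  val to_int : t -> int
  val add : t -> t -> t
end = struct
  type t = int
  let of_int n = if n >= 0 then Some n else None
  let to_int n = n
  let add = ( + )
end

(* The type checker rejects [Nat.add 1 2]: outside the module,
   [int] and [Nat.t] are different types. *)
let three =
  match Nat.of_int 1, Nat.of_int 2 with
  | Some a, Some b -> Some (Nat.to_int (Nat.add a b))
  | _ -> None
```

Because the hiding is enforced by the compiler rather than by convention, the representation of `Nat.t` can change later without breaking any client.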

Of course, the ML module system is not the only mechanism for implementing abstract data types. There are also classes and interfaces (as in Java or C++), and another option is type classes, as in Haskell (they all basically differ in how they represent polymorphism, but that’s a completely different topic). In any case, just having types for definitions, without a mechanism for defining the types of mathematical structures (i.e., sets of operations), is not enough.
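A sketch of what "the type of a mathematical structure" looks like in OCaml: a signature describing a monoid, and a functor that works for any structure of that type. All names here (`MONOID`, `Fold`, `IntSum`, `StringCat`) are chosen for this example only.

```ocaml
(* The signature describes a set of operations, not a single value. *)
module type MONOID = sig
  type t
  val empty : t
  val combine : t -> t -> t
end

(* Code written against the signature works for every monoid. *)
module Fold (M : MONOID) = struct
  let concat (xs : M.t list) : M.t =
    List.fold_left M.combine M.empty xs
end

(* Two structures that happen to have that type. *)
module IntSum = struct
  type t = int
  let empty = 0
  let combine = ( + )
end

module StringCat = struct
  type t = string
  let empty = ""
  let combine = ( ^ )
end

module SumFold = Fold (IntSum)
module CatFold = Fold (StringCat)
```

Here `SumFold.concat` sums a list of integers and `CatFold.concat` concatenates a list of strings, both through the same functor body.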

Whether or not to have a separate mli file for signatures is a design question. I personally like it, though it poses some technical problems and doesn’t play well with namespaces. In OCaml, you can consider mli files as a shorthand and even consider them optional. Some projects (e.g., ocamlbuild) define all their abstractions (module types) in one ml file, which is then used by different implementations. Although that’s not common today, it’s a viable option.
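The "all module types in one file" style can be sketched as follows. This is not taken from any real project: the file names in the comments and the modules `Signatures`, `Assoc_kv`, and `Map_kv` are hypothetical, and the files are simulated as nested modules so that the example fits in a single compilation unit.

```ocaml
(* signatures.ml (hypothetical): module types only, no code. *)
module Signatures = struct
  module type KV = sig
    type t
    val empty : t
    val set : string -> int -> t -> t
    val get : string -> t -> int option
  end
end

(* assoc_kv.ml (hypothetical): one implementation, sealed with the
   shared signature instead of having its own mli file. *)
module Assoc_kv : Signatures.KV = struct
  type t = (string * int) list
  let empty = []
  let set k v m = (k, v) :: List.remove_assoc k m
  let get k m = List.assoc_opt k m
end

(* map_kv.ml (hypothetical): another implementation of the same type. *)
module Map_kv : Signatures.KV = struct
  module M = Map.Make (String)
  type t = int M.t
  let empty = M.empty
  let set = M.add
  let get = M.find_opt
end
```

With real files, writing `include (struct ... end : Signatures.KV)` at the top level of an ml file, or an mli containing `include Signatures.KV`, would give a similar effect.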
