A module signature can hide details of the implementation. In particular, it can hide the representation of a type so that a client can only use what the interface reveals. Below is a FIFO queue that internally uses two lists, but this is not revealed to clients. A simpler implementation could use just one list. The implementation could also contain additional functions and values that are hidden from clients (which is not the case here).
(* fifo.mli *)
type 'a t
val empty: 'a t
val push : 'a -> 'a t -> 'a t
val pop : 'a t -> 'a t
val peek : 'a t -> 'a option
(* fifo.ml *)
type 'a t = 'a list * 'a list
let empty = ([], [])
let push y = function
| [] , [] -> [y], []
| [] , ys -> assert false
| xs , ys -> xs, y::ys
let pop = function
| [] , [] -> failwith "fifo is empty"
| [x] , ys -> List.rev ys, []
| x::xs , ys -> xs , ys
| [] , ys -> assert false
let peek = function
| x::xs , _ -> Some x
| [] , [] -> None
| [] , _ -> assert false
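To make the hiding concrete, here is a minimal sketch of what a client sees. It folds the interface and implementation above into one module so the example is self-contained; the names q, first and second are only for illustration.

```ocaml
(* The signature plays the role of fifo.mli, the structure that of fifo.ml. *)
module Fifo : sig
  type 'a t
  val empty : 'a t
  val push : 'a -> 'a t -> 'a t
  val pop : 'a t -> 'a t
  val peek : 'a t -> 'a option
end = struct
  (* Invariant: the front list is empty only when the whole queue is empty. *)
  type 'a t = 'a list * 'a list
  let empty = ([], [])
  let push y = function
    | [], [] -> ([y], [])
    | [], _ :: _ -> assert false
    | xs, ys -> (xs, y :: ys)
  let pop = function
    | [], [] -> failwith "fifo is empty"
    | [_], ys -> (List.rev ys, [])
    | _ :: xs, ys -> (xs, ys)
    | [], _ :: _ -> assert false
  let peek = function
    | x :: _, _ -> Some x
    | [], [] -> None
    | [], _ :: _ -> assert false
end

(* Client code: only the operations listed in the signature are available. *)
let q = Fifo.(empty |> push 1 |> push 2 |> push 3)
let first = Fifo.peek q              (* the element pushed first *)
let second = Fifo.peek (Fifo.pop q)  (* the next one after a pop *)

(* Rejected by the type checker, because Fifo.t is abstract outside the module:
   let (front, back) = q *)
```

Since clients cannot observe the pair of lists, the representation could later be swapped for a single list without touching any client code.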
For such cases we can use private/public declarations or naming conventions.
P.S. I’m not against this language design. I just want to understand the reason for it, because I have some interesting (at least to me) ideas that I want to implement in my own ML-inspired language. So if there is no strong reason for this decision, I’ll try to implement signatures another way (I’ll implement explicit typing with no type inheritance, which should also speed up the compiler).
Do you want to keep the concept of an interface such that you can decide whether an implementation matches an interface? If so, I think this is a good reason to have an explicit syntax for interfaces (sig … end) of which interface files (*.mli) are a special case.
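As a small sketch of what "deciding whether an implementation matches an interface" looks like in practice (COUNTER, Int_counter and reset are hypothetical names made up for this example):

```ocaml
(* A hypothetical interface: a counter that can only be bumped and read. *)
module type COUNTER = sig
  type t
  val zero : t
  val incr : t -> t
  val to_int : t -> int
end

(* An implementation with an extra helper the interface does not mention. *)
module Int_counter = struct
  type t = int
  let zero = 0
  let incr n = n + 1
  let to_int n = n
  let reset _ = zero
end

(* The compiler checks here that Int_counter matches COUNTER. The resulting
   module hides [reset] and the fact that [t] is [int]. *)
module Counter : COUNTER = Int_counter

let two = Counter.(zero |> incr |> incr |> to_int)
```

If Int_counter lacked one of the listed values, or gave it an incompatible type, the constraint on Counter would be a compile-time error.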
I already argued about this elsewhere in this website.
Ideally one would have had only ml files, and any wishes the programmer had about how a value in a module should appear outside the module, would be specified in the ml file directly, around the definition of the value.
But that would have made the language harder to design and specify, I guess. Which is why OCaml retains this C-like cumbersome duplication of information into header and implementation files.
Indeed, and I thank you for linking to a post where I completely disagree with everything you wrote. Let me respond to it:
“It makes it easier to orient oneself in large code bases. A single file I can peruse will show me the exact piece of functionality the module is exposing to the rest of the code base without having to crawl through the private parts of the module.”
On the contrary, the larger the module, the more incomplete and the less useful the mli is and in most cases you’re better off going to the ml directly.
“encourages interface design and thinking”
No, it only encourages writing a shiny mli and does nothing to promote clean, well-organized code in the ml where it is most needed.
To avoid the punishment of having to update everything twice, programmers cannot develop the interface and implementation concurrently and are forced into always writing the interface first, which is merely one of several possible design paradigms.
Abstracting first is not always right. Abstraction is a trade-off like everything else. Think of leaky abstractions.
An implementation (*.ml) is not required to have an *.mli file, but without one, hiding certain aspects has to be done at the level of sub-modules.
(* fifo.ml *)
module T: sig
  type 'a t
end = struct
  type 'a t = 'a list * 'a list
end
But I understand that you would prefer annotations on value definitions that control visibility and only resort to signatures when this is not enough. (I don’t consider maintaining interface files much of a burden and consider them a good mechanism and place for documentation.)
There’s no such thing as an incomplete mli file. The .mli file tells you precisely what the module exposes to the rest of the code base. The rest is off limits thanks to the hiding property of interfaces, and this is precisely what makes modular understanding of a code base easier.
Neither does not having .mli files ;-). At least you get clean interfaces and a clear summary of the entry points to the module.
You are not forced to write the interface first (you can also not write it at all initially). These things tend to be developed and refined in conjunction.
I can argue the exact opposite: the larger the implementation, the more useful the interface file becomes to skim over the irrelevant implementation details and only bother with understanding the interface i.e. how to use the module.
Not really. You can think about the interface and the implementation separately–in fact that’s the whole point. Whether you write a well-organised implementation or not is up to you; the mli just encourages writing a well-organised interface to make things easier for your users.
Sure, and you definitely don’t have to abstract first, in fact a lot of people don’t in OCaml. Write the implementation first, then worry about the interface.
Side note: @egoholic, some people ask the related question, ‘Why do I need two separate files for the module interface and implementation?’ In case you were wondering, you don’t actually need two separate files, if you want a single file you can use the include trick:
(* id.ml *)
include (struct
  type t = int
  let make int = int
  let toInt t = t
end : sig
  type t
  val make: int -> t
  val toInt: t -> int
end)
This defines and constrains the module to a signature in a single file.
If you’re doing API-driven development, it would be nice to be able to automatically generate an .ml file with stub functions that satisfies the interface. A blog at Jane Street has a way to work around this, but I think editor support would be nice to have.
I think .mli files could (and should), if present, be used as hints to do the .ml. Just like in Java where you say a class is implementing an interface, then the IDE would tell you that you’re missing an implementation of such-and-such methods.
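For illustration, here is a hand-written sketch of what such generated stubs could look like for the fifo interface from earlier in the thread. This is not the Jane Street workaround and not the output of any existing tool; FIFO, Fifo_stub and the Todo constructor are hypothetical names.

```ocaml
module type FIFO = sig
  type 'a t
  val empty : 'a t
  val push : 'a -> 'a t -> 'a t
  val pop : 'a t -> 'a t
  val peek : 'a t -> 'a option
end

(* Stub implementation: it type checks against FIFO, and every function
   that has not been written yet fails loudly when called. *)
module Fifo_stub : FIFO = struct
  type 'a t = Todo  (* placeholder representation, to be replaced *)
  let empty = Todo
  let push _ _ = failwith "push: not implemented"
  let pop _ = failwith "pop: not implemented"
  let peek _ = failwith "peek: not implemented"
end
```

With stubs like these in place, the rest of the code base compiles against the interface while the real implementation is filled in one function at a time.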
More than once I tried to talk to @let-def about making Merlin’s C-c C-l switch between implementation and interface (the way C-c C-a does at the file level) when the cursor is on the identifier of a val or let, rather than stopping with the unhelpful “Already at definition point”. I don’t remember exactly how those discussions ended. Maybe the problem is less trivial than it seems once you factor in the full power of the language (or maybe he thinks it’s a bad idea).
No one has mentioned what was an “original” reason, so the history behind this might be with only a few people. @xavierleroy?
I can only suspect that it was done for clarity. You can specify the parameter types in the .ml file, so you don’t need an mli at all. But reading through the implementation code while searching for a signature doesn’t seem efficient; mli files are nicer indexes.
And if you were to abstract or obscure the implementation, it makes sense to produce a simple file of signatures to accompany your library. Note that you can initially auto-generate your mli and then trim it down.
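A sketch of why the trimming step matters, reusing the small id example from earlier in the thread (Id_inferred and leaky are names invented here). Running ocamlc -i on an .ml file prints the interface the compiler infers. That inferred interface exports everything with its most general type, which is usually more than you intend to promise.

```ocaml
(* With no signature, everything is exported as inferred: [t] is known
   to be [int], and [make] and [toInt] are fully polymorphic. *)
module Id_inferred = struct
  type t = int
  let make int = int
  let toInt t = t
end

(* A trimmed, hand-edited signature: [t] becomes abstract and the
   functions are restricted to the intended types. *)
module Id : sig
  type t
  val make : int -> t
  val toInt : t -> int
end = Id_inferred

(* Accepted, because the inferred type of make is 'a -> 'a. *)
let leaky = Id_inferred.make "not an int"

(* Through the trimmed signature only the intended use type checks. *)
let n = Id.toInt (Id.make 42)
```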
See A History of OCaml. OCaml has its origin in Caml, Caml Light, and Caml Special Light as new implementations of ML, originally as a bytecode interpreter. Standard ML has signatures, but at least Standard ML of New Jersey is an image system without separate compilation, whereas Caml Light had files mapped to top-level modules. This raised the question of how to implement signatures for those, and I believe interface files were a natural answer. In particular, interface files shield client modules from re-compilation if only the implementation changes. I believe this is not true in the case of native compilation, though.
Addendum: as pointed out below by @dbuenzli, the -opaque option can be used to make natively compiled modules only depend on interfaces for faster re-compilation at the cost of reduced code quality.
@kantian good point, I wasn’t fully aware of what you just stated. But I think that you can still have separate compilation without mli files, because the OCaml compiler knows to create a cmi file from the ml file when there’s no mli file present.