Future of `.mli` only module setups?

compiler

#1

Hi all !

for a long time, the Coq project has been using the OCaml feature that allows to use “interface-only” modules, that is to say, modules defined by a single mli file which don’t provide the corresponding .ml implementation.

However, we are aware of some limitations wrt to modules without an implementation, such as
https://caml.inria.fr/mantis/view.php?id=6923 , and we have heard of some other issues as for instance with packing.

So the the Coq developers have been wondering for a while if we should migrate away from the .mli only setup? Are there plans in OCaml to support .mli-only modules as first class citizens in the future?

Thanks for any help! Cheers, E.

edit: Link to Coq’s “intf” files: https://github.com/coq/coq/tree/v8.6/intf


#2

Is there a reason to not instead use single .ml files without .mli files?


#3

Our understanding is that this setup is indeed discouraged, due to problems in .cmi file generation, but we don’t really know. (Personally, I think that given this case we would just really ship a normal .ml/.mli module combo for the current files in intf)


#4

Jbuilder also warns against using mli only modules, with the following issue discussing various downsides of using them: https://github.com/janestreet/jbuilder/issues/9#issuecomment-283031113

That might be helpful in making your decision. In MirageOS, we’ve been migrating all our mli-only interfaces (e.g. in mirage-types) to ml versions without a corresponding mli file, and there have been no downsides observed so far. There are a few more dummy compilation units on the command line, but this has no appreciable slowdown.


#5

I have been using pure-mli files myself, and I find them particularly useful when the types and signatures are still in flux. After issues with build systems and learning about the limitations, I’ve used copy rules to duplicate the .mli file to .ml instead. For ocamlbuild projects I’ve used a generic rule

rule "%.mli & %.idem -> %.ml"
  ~deps:["%.mli"; "%.idem"] ~prod:"%.ml"
  (fun env build -> cp (env "%.mli") (env "%.ml"))

along with empty .idem files to limit the scope. For jbuilder I currently add a copy rules per-file, but hopefully there will be rule-schemas in the future.


#6

Oh, sorry to go off-topic, but how does the .idem files work? I’ve never encountered this before.


#7

I have been using pure-mli files myself, and I find them particularly useful when the types and signatures are still in flux.

Indeed. However, rather than creating ocamlbuild/jbuilder rules, I have found this approach - https://blog.janestreet.com/simple-top-down-development-in-ocaml/ - quite helpful for similar reasons/use-cases.


#8

Personally I am against the ml/mli organization in OCaml has been copied from the .h/.c organization in C. C has been criticized for this duplication of information, and the same criticism applies in part to OCaml.
I think OCaml would have been better if its syntax had integrated some way to directly tell in the ml : I want this value to be visible or not in the signature (pretty similar to declaring methods to be private or public in OO).
It is, of course, not true that in programming you should do 100% of the interface first then proceed with the implementation. In real life, you have an alternance of interface and implementation.


#9

Specifying which values are private or public is not enough, in some cases, you may have a very generic implementation of a function, but you want to restrain its type in the interface (for instance when using polymorphic variants or GADTs). Since in these cases, you often need to annotate the function with its actual type, this could become quite heavy syntactically since you’d need to write both the real and the exported type.


#10

I don’t mind the .ml/mli separation, but I’ll note to @zozozo that duplicating the declaration in the .mli file is no better than sometimes duplicating it in the .ml file and specifying “public” or whatever. Though I suspect at this point the language is far past the point where such a change is feasible.


#11

(Oh, and as a data point, Niklaus Wirth abandoned the separate module interface files of Modula-2 in his design for Oberon, saying they hadn’t been a success. But again, this is purely of academic interest.)


#12

It has some advantages in my opinion. I makes it easier to orient oneself in large code bases. A single file I can peruse will indicate me the exact piece of functionality the module is exposing to the rest of the code base without having to crawl through the private parts of the module. .mli files reveal the program design without having to resort to higher-level IDE functionality and encourages interface design and thinking. It’s also a good spot to hold the documentation everyone loves to write.


#13

I tend to agree, though my experience in doing this in OCaml is fairly recent. (It is certainly useful to examine, say, .h files in a large C codebase.) Mostly I was noting that the amount of effort involved seems similar.


#14

A single file I can peruse will indicate me the exact piece of functionality the module is exposing to the rest of the code base without having to crawl through the private parts of the module

Yep.

For ocaml newcomers, @dbuenzli pacakges(https://github.com/dbuenzli) are a good example of this. :slight_smile:


#15

They are just stamps, i.e. touch foo.idem; git add foo.idem to tell ocamlbuild to activate the copy rule for foo.ml.