Many very good points were raised already, so I’ll just add a couple of historical notes and personal preferences.
The “one compilation unit = one .mli interface file + one .ml implementation file” design goes back to Caml Light and was taken from Modula-2, Wirth’s excellent Pascal successor. As previously mentioned, it works great with separate compilation and parallel “make”.
But the main point in favor, in my opinion, is that it clearly separates the user’s view of the module (the .mli file) from the implementor’s view (the .ml file). In particular, documentation comments on how to use the module go to the .mli file; comments about the implementation go to the .ml file. This stands in sharp contrast with most of the Java code I’ve written and read, where comments are an awful mixture of documentation comments and implementation comments, and serious IDE support is needed to see just the user’s view of a class and nothingelse.
Also, in the .mli file declarations can be listed in the most natural / pedagogical order, starting with the most useful functions and finishing with less common ones, while definitions in .ml files must come in bottom-up dependency order.
OCaml combines this Modula-2 approach to compilation units with a Standard ML-like language of modules, featuring nested structures, functors, and multiple views of structures. The latter is a rarely-used but cool feature whereas a given structure can be constrained by several signatures to give different views on the implementation, e.g. one with full type abstraction for use by the general public and another with more transparent types for use by “friend” modules. Again, and perhaps even more than in Modula-2, it makes sense to separate structures (implementations) from signatures (interfaces) that control how much of the implementation is visible.