Circular linker dependencies for functions and types

jfehrle · March 30, 2021, 2:38am

If I have 2 routines in different files that are mutually recursive, OCaml complains if they try to refer to each other in the obvious, naive way. But, as the manual describes, you can around get limitation fairly simply like this:

Use a reference to hold one of the two functions, as in:

mod1.ml:
let forward_g =
ref((fun x → failwith “forward_g”) : )
let f x = … !forward_g …

mod2.ml:
let g y = … Mod1.f …
let _ = Mod1.forward_g := g

On the other hand, AFAICT, it seems that mutually recursive types must be defined in the same file and joined with an “and”. For example, if I have existing code defining 2 records that now need to become mutually recursive, I must refactor the types so they are in the same file.

Or (the case I actually encountered) suppose I have 2 source files A and B respectively defining typeA and typeB. If A calls a routine in B, then if I try to have typeB refer to typeA, the linker will again complain. I’d need to move typeB into A.mli or create a new .mli file with both typeA and typeB.

Is there any way to do this without moving type definitions between files? It would be helpful if there was a way to do so without having to refactor. OTOH this may be difficult to impossible to support.

PS It would be great if Ocaml permitted forward reference to functions defined in the same file without special syntax. (Or second best, with a forward declaration). Fairly often I find I need to reorder routines in a file in order to support new functionality. Sometimes this is nontrivial. The resulting block moves in the git history suggest that the changes were more significant than they are. In addition, the git history doesn’t make it easy to find changes that were made to a moved routine.

vlaviron · March 30, 2021, 10:24am

Regarding types: you can sometimes get around your problem if one of the modules doesn’t need to know the exact shape of the types from the other module. In that case, you can use a type variable instead of the exact type in your first module, and your second module would close the recursion by defining its own type, which can then contain arbitrary references to both itself and the other type.
Example:

(* Target definitions *)
type typeA = A of typeA * typeB
and typeB = B of typeB * typeA

(* a.ml *)
type 'typeB typeA = A of 'typeB typeA * 'typeB

(* b.ml *)
type typeB = B of typeB * typeB A.typeA
 (* Or alternatively:
type typeB = B of typeB * typeA
and typeA = typeB A.typeA
*)

This doesn’t work if a.ml needs to know the concrete representation of typeB of course.
In that case, there are roughly two solutions with different trade-offs:

Put all your types in a single file (you’re already aware of this solution of course). It used to be fashionable to have, in a project or library, a single .mli-only file that would hold all the type definitions that needed to be shared. But it’s going out of style, and I can understand why.
Parametrize with functors. One of the modules will be replaced by a functor taking as argument everything you need to use from the other module (types and values). Then the other one will use recursive modules to define both itself and the application of the functor to itself. You still need a few more tricks if your types are mutually recursive, so in practice this is likely not going to help much.

There is also some proposal to allow several files to be compiled together as if they were mutualy recursive modules, which should be rather convenient for you, but the proposal isn’t merged yet and you still get a few bad consequences of using recursive modules. You can see a description of the proposal here, and I believe there’s an actual implementation lying around too.

Finally, about forward references: one of the issues is that in OCaml functions are not static objects like, for example, in C. Instead, defining a function dynamically allocates the corresponding structure. It is obviously not possible to have code like:

let x = y + 1
let y = x + 1

So for functions the problem is the same.
The native compiler does have a notion of static function (basically a closed function), so in theory one could propose an extension that allows forward declarations for such functions, but as far as I remember the bytecode compiler doesn’t have the corresponding mechanisms so you would end up with code that would work on native but fail to compile in bytecode (although maybe emulation with references would work ?).
For the refactoring issue, there is one possible solution that would work with current compilers:

(* Initial code *)
let f x = ...
(* Other code *)
let g y = ... f x ...

(* Refactoring: now f needs to call g too *)
let f_0 g x = ... g y ...
(* Other code *)
let rec g y = ... f x ...
and f x = f_0 g x

With the right inlining annotations, you should be able to get the same code as if you had moved the definition of f completely.

Topic		Replies	Views
How to deal with "recursive" modules? Learning module	10	5498	June 21, 2018
Value restriction and mutually recursive functions Learning	3	3055	August 11, 2018
Type inference and recursive modules Learning	4	889	October 5, 2021
Question about the scope of recursive modules Learning	4	626	May 18, 2020
Explicit type binding and mutual recursion Learning	2	1052	June 4, 2022

Circular linker dependencies for functions and types

Related topics