Proposal: a new `exports` field in `findlib` META files

Note: in what follows I use, following dune terminology, the term library names for what is formally ocamlfind package names.

Recently I have deprecated a few library names in some packages, namely vg.svg, uunf.string and uuseg.string.

To do so I made empty libraries in the META files installed by these packages. These empty libraries simply require their replacement library and warn on usage with a warning field (sample).

However this can’t work if you have a build system with correct dependencies, a.k.a. implicit_transitive_deps set to false in dune.

In this mode the includes (-I) of the requires of a specified library dependency are not added during the compilation phase. So using the requires field to proxy a library for another one can’t work.

While it seems that correct dependencies break in all sorts of obscure ways, upstream has integrated support for hidden includes (-H) in the upcoming OCaml 5.2. But in order to be able to properly use it we need to extend the metadata we keep about libraries, so that we know which libraries need to be included with -H and which with -I.

Proposal

Add an exports field to META files which has exactly the same syntax as requires, a list of library names. Libraries mentioned in a library’s requires are not exposed to code compiling against the library while those in exports are.

In the compilation phase this means that libraries mentioned in requires get included with -H while those in exports get included with -I.

More precisely, the semantics of requires and exports are as follows.

Compiling with correct dependencies (OCaml >= 5.2)

Given a sequence of library names libs to compile and link against.

Let exported_libs be the smallest set of library names that includes libs and their transitive exports.

Let hidden_libs be the smallest set of library names that includes the transitive requires of libs, the transitive requires of the exports of libs and the transitive exports of any library found in requires that are not in exported_libs.

  1. Compilation phase. For each library in exported_libs we include its library directory with -I. For each library in hidden_libs we include its library directory with -H.

  2. Linking phase. The library archives of hidden_libs and exported_libs need to be provided, sorted in stable topological order.
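
The two set definitions above can be sketched in OCaml. This is only a minimal sketch under my reading of the definitions: hidden_libs is taken to be everything reachable from libs through requires or exports, minus exported_libs. The `lib` record, the `lookup` and `dir_of` functions and the library names in `db` are made up for illustration and are not part of findlib or dune.

```ocaml
module S = Set.Make (String)

(* Hypothetical in-memory view of the two META fields involved. *)
type lib = { requires : string list; exports : string list }

(* Smallest set containing [roots] and closed under [next]. *)
let closure (next : string -> string list) (roots : string list) : S.t =
  let rec go seen = function
    | [] -> seen
    | n :: todo when S.mem n seen -> go seen todo
    | n :: todo -> go (S.add n seen) (next n @ todo)
  in
  go S.empty roots

(* Returns (exported_libs, hidden_libs) for the requested [libs]. *)
let resolve (lookup : string -> lib) (libs : string list) : S.t * S.t =
  let exported = closure (fun n -> (lookup n).exports) libs in
  let reachable =
    closure (fun n -> let l = lookup n in l.requires @ l.exports) libs
  in
  (exported, S.diff reachable exported)

(* -I for exported libraries, -H for hidden ones. *)
let compile_flags (dir_of : string -> string) ((exported, hidden) : S.t * S.t) =
  let flag f n = [ f; dir_of n ] in
  List.concat_map (flag "-I") (S.elements exported)
  @ List.concat_map (flag "-H") (S.elements hidden)

(* Toy database with made-up library names. *)
let db = function
  | "legacy" -> { requires = [ "modern" ]; exports = [ "modern" ] }
  | "modern" -> { requires = [ "internal" ]; exports = [] }
  | _ -> { requires = []; exports = [] }
```

With this toy database, `resolve db ["legacy"]` puts legacy and modern in the exported set and internal in the hidden set, so only the first two get a -I.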

Compiling with incorrect dependencies (OCaml < 5.2)

Given a sequence of library names libs to compile and link against.

Let overshoot_libs be the smallest set of library names that includes libs and their transitive exports and requires library names.

  1. Compilation phase. For each library in overshoot_libs we include its library directory with -I.

  2. Linking phase. The library archives of overshoot_libs need to be provided, sorted in stable topological order.
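
For the linking order, here is a minimal OCaml sketch, assuming that "stable" means a deterministic order that follows the order in which names are given in libs and in the fields. The `lib` record, `lookup` and the names in `db` are hypothetical. It computes overshoot_libs and its order in one pass with a depth-first post-order, so every library comes after the libraries it requires or exports.

```ocaml
(* Hypothetical in-memory view of the two META fields involved. *)
type lib = { requires : string list; exports : string list }

(* Depth-first post-order: a library is emitted only after everything
   it requires or exports, so dependencies come first in the result. *)
let link_order (lookup : string -> lib) (libs : string list) : string list =
  let rec visit (seen, acc) n =
    if List.mem n seen then (seen, acc)
    else
      let l = lookup n in
      let seen, acc =
        List.fold_left visit (n :: seen, acc) (l.requires @ l.exports)
      in
      (seen, n :: acc)
  in
  let _, rev_order = List.fold_left visit ([], []) libs in
  List.rev rev_order

(* Toy database with made-up library names. *)
let db = function
  | "top" -> { requires = [ "a"; "b" ]; exports = [] }
  | "a" | "b" -> { requires = [ "c" ]; exports = [] }
  | _ -> { requires = []; exports = [] }
```

The elements of `link_order db ["top"]` are exactly overshoot_libs; in the compilation phase each of them gets a -I.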

Backward compatibility

For backward compatibility with systems that do not understand exports and are oblivious of -H, the library names mentioned in the exports field of a library must be replicated in its requires field in META files. Implementers of correct dependencies must remove from requires the names that exist in exports.
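
The removal step can be sketched in a few lines of OCaml. The function names are hypothetical, and the splitter assumes that field values are lists of names separated by commas and/or whitespace.

```ocaml
(* Split a field value such as "a, b c" into library names.
   Assumption: names are separated by commas and/or whitespace. *)
let names_of_field (v : string) : string list =
  String.map (fun c -> if c = ',' || c = '\t' || c = '\n' then ' ' else c) v
  |> String.split_on_char ' '
  |> List.filter (fun s -> s <> "")

(* The requires a correct-dependencies implementation should use:
   the names of requires that are not replicated from exports. *)
let effective_requires ~(requires : string) ~(exports : string) : string list =
  let exported = names_of_field exports in
  List.filter (fun n -> not (List.mem n exported)) (names_of_field requires)
```

For example, `effective_requires ~requires:"a, b c" ~exports:"b"` keeps a and c and drops b, which is then handled as an export.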

Since ocamlfind doesn’t mind fields it doesn’t know, no changes are immediately needed in ocamlfind and the proposal can be used from now on.

Usage

The exports field can be used for two things:

  1. Along with a warning field a library name can be deprecated and be transparently replaced by a list of other libraries.
  2. A library name can be used to define a “meta” library which represents a bunch of other libraries against which to compile and link. This library can also provide code itself.

Note that these two usages formerly existed in the incorrect dependency world but are no longer possible in the correct dependency world. The exports field allows us to bring them back.
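
To make the first usage concrete, here is a sketch of what a deprecated library could look like in a META file. The package and library names are made up, and exports is of course the field proposed here, not something existing tools interpret.

```
# Hypothetical fragment of the META file of a package "foo".
# The empty library foo.old is deprecated in favour of foo.new.
package "old" (
  description = "Deprecated, use foo.new"
  warning = "Deprecated, use the foo.new library."
  # Replicated in requires for tools that do not understand exports.
  requires = "foo.new"
  exports = "foo.new"
)
```

A tool that understands exports adds the directory of foo.new with -I when foo.old is requested; a tool that does not simply sees the requires.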

Just for background: implicit_transitive_deps still has experimental status precisely because of this problem (as it can trigger the “obscure” problems that you mention). Fixing this is, to a large extent, what motivated the addition of -H to the compiler. Now that -H has been added to the compiler, implicit_transitive_deps will be patched (see “Add support for the new -H <dir> argument in OCaml 5.2”, Issue #9333 in ocaml/dune on GitHub) to use it in the way you describe (adding -H for the libraries in the requires field).

In other words, I am not sure I understand the need for a new represents field; rather implicit_transitive_deps (and any other build system that wants to use non-transitive dependencies) should be adapted to make use of -H. Doing so also fixes the “obscure” bugs mentioned before.

Cheers,
Nicolas

Using a library’s API should not, in general, entail usage of the API of its requires as is the case now (i.e. we should not -I the library directories of requires); if you don’t understand why see here. -H solves this problem.

Now consider the usage section of my proposal and try to solve the cases I mention with -H. You can’t.

You need a way for libraries to indicate which of their dependencies they consider as abstract (requires, -H) and those they don’t (represents, -I). Changes to requires should not break users of libraries while changes to represents do.

We can certainly do without but this entails that:

  1. We can’t deprecate library names without breaking users.
  2. We can’t have library names like my-currated-set-of-libs that stand for using the API of a bunch of other libraries.

Indeed, I hadn’t read the proposal carefully enough. I agree those cases are impossible to achieve today in the presence of non-transitive dependencies. The proposal makes sense to me now, thanks!

Cheers,
Nicolas

It can also be solved with a new “zorglub” field.

I can’t figure out what you are talking about.

Please try to contextualize your sentences and make your proposals precise.

Feel free to propose something better.

To be perfectly honest this has absolutely nothing to do with dune and everything to do with OCaml’s unspecified and loose (or broken until we get -H if you ask me) compilation model. If you don’t understand why read the first sentence of this message.

No, read the last paragraph of the proposal. If ocamlfind is not changed to support -H and -I and to interpret represents, compiling through ocamlfind itself will simply compile with imprecise dependencies. Personally I just use ocamlfind’s META files via ocamlfind query, so I don’t need changes in ocamlfind, YMMV.

What you call represents: foo feels like include: foo to me, at least in the case where you want to represent/include several things.

More precisely, my impression is that represents names the reason why one would want to use the feature (but I am not sure that this covers all reasonable reasons), while include names what the feature does.

Maybe two realistic examples (one where you represent/include a single library, the other where you represent/include several) would help get a feeling for which names read better in action.

As alternatives more on the side of the reasons, there would be provides or exposes.

I quite like the symmetry requires/provides but I find it slightly misleading since the library itself may also provide something. The same problem exists with represents or exposes: the library represents/exposes the libraries mentioned therein, but using the library may do more than that.

includes (with an s, like requires it applies to the library you define) is maybe the best option. Using the library includes using these other libraries. Though I don’t necessarily like the overloading of the term with concrete includes.

If I understand the proposed feature correctly (of which I am really unsure), then following the terminology from Coq, exports could make sense (which is not that different from exposes).

I suppose you are referring to this. I would quite prefer exports to includes, that way there’s no terminology overloading.

EDIT: rresult, not rrequire

I’m familiar with CMake “usage requirements” (Key Concepts - Usage Requirements - Mastering CMake) but I’m not asking that anyone adopt that language. I just want to know if my mental model is right. I’ll use base64 -> rresult as an example because rresult was linked previously.

  • rresult consumes result.
  • rresult consumes ocaml (more accurately stdlib. Wish I had a better example!).
  • base64 consumes rresult.
  • transitivity? base64 should consume ocaml.
  • transitivity? base64 should not consume result.

Since transitivity should be controllable, we need two types of “consumes” edges. The binary choice is controlled by -I or a new -H option.

In the language of CMake:

  • rresult has a PRIVATE dependency on result. That means any consumers of rresult do not transitively consume result.
  • rresult has a PUBLIC dependency on ocaml. That means any consumers of rresult do transitively consume ocaml.
  • This bit is for completeness and is probably irrelevant for this thread. These PRIVATE/PUBLIC declarations are not just for consumers of rresult but also determine which libraries+flags are used during the compilation of rresult. If you want to export a library+flag and not require it during compilation (ex. closely maps to Dune’s virtual library concept), there is a third INTERFACE declaration.

Is that mental model above accurate?

Mostly except I find the term “consuming” a bit misleading. The thing to keep in mind is that all that is modulated by the build phase, that is compilation and linking.

In your example in the compilation phase, using base64 must not automatically expose result’s API (the library’s cmi and cmx files). But in the linking phase, using base64 must automatically link against result’s library archive (the cmxa file) which I personally would count as “consuming”.

"In your example in the compilation phase, using base64 must not automatically expose result’s API (the library’s cmi and cmx files). But in the linking phase, using base64 must automatically link against result’s library archive (the cmxa file) which I personally would count as “consuming”."

Ah; my mental model needs to be updated. The edges are really between groupings of library artifacts (ex. [lib\rresult\rresult.cmxa] is the linking artifact group and [lib\rresult\rresult.cmi; lib\rresult\rresult.cmx] is the compilation artifact group). My former mental model was that edges were between libraries (ex. lib\rresult\META).

That helps tremendously.


The immediate problem is we can’t express transitivity for a library’s “API” artifacts (aka. the compilation artifact group). This proposal definitely solves the immediate problem.

The general problem is that we can’t express transitivity for individual library artifacts. I don’t know of any concrete examples that could be solved by immediately skipping to the general problem.

+1 to the proposal

C’mon dude. I love your stuff, but why are you inventing new terms like “consumes”? Does that not mean “depends on”? Occam’s razor: do not add crap that isn’t necessary, in so many words.

"In your example in the compilation phase, using base64 must not automatically expose result’s API (the library’s cmi and cmx files)."

When did the cmx files become part of the API?

Whether or not cmx files are dependencies depends on whether or not -opaque is passed.

Let’s say result’s API objects, happy now?

Define “API objects”.

I confess I do not understand how sloppy language helps anybody.

I suggest you open an issue upstream to clarify all this sloppy language :–) Good luck.

Case in point: “correct” v. “incorrect” deps. There is nothing incorrect about indirect dependencies.

From my point of view as maintainer of the part of the compiler that deals with cmx files, this is slightly wrong. An implementation never “depends” on other cmx files, as you can always compile just from the cmi files (you might have to disable warning 58 if the interfaces were not compiled with -opaque). If the cmx files are present, then further optimisations can be enabled, at the cost that you must make sure that the cmx files read during compilation and passed to the linker are coherent (otherwise the linker will throw an error).

However, if you design a build system it is likely a good idea to register build dependencies between cmx files, to make sure that cross module optimisations happen as expected.

Oh there is, you get opam packages and libraries with incorrect dependency specifications.

Could you maybe get a grip on the actual problem, it would make the discussion more productive.