How to extract just what you need from a dune project

It often happens that I rely on a bulky third-party OCaml package but only use a small part of it. I then end up creating my own private, unofficial “subpackage” of it.
In that context, it would be useful for me to have a tool that computes, given a set of source files in the project, all the other source files which are needed for that particular set.
ocamldep goes in that direction, but AFAIK what I’m looking for is not easily doable with ocamldep alone: at a minimum it would take a nontrivial script, not just a few commands.
Since most of the packages I use are dune packages, perhaps I can take advantage of the dune metadata to do this?
I have heard of dune-deps, but it seems way too advanced for my problem: I don’t need a fancy graph image, just a list of .ml/.mli files, and I would need to delve into its internals to convert its output to that format.

Sounds useful.

Rather than reasoning in terms of source files, a better approach may be to have a way to ask Dune to “extract” a set of executables or library stanzas, meaning to extract the set of dependencies needed to build the given executables/stanzas + Dune files with unnecessary stanzas removed, etc.

You may want to open a feature request on the ocaml/dune issue tracker on GitHub.

Cheers,
Nicolas

Codept can compute the set of ancestors of a set of files:

codept dirname -ancestors-of basename.ml -ancestors-of basename2.ml ...

will compute the dependencies of all the files in dirname that basename.ml and basename2.ml depend on.

Then you can extract the list of modules from one of the outputs. In particular, the -deps output ends with an index of modules and their mapping to files.

Let me expand on what you just said.
If, for example, on the easy-logging package I run (as you indicated in another thread) codept . -pkg unix -pkg calendar -deps -ancestors-of easy_logging.ml, I get a JSON answer whose interesting part (for me) is

"local" :
[{ "module" : ["Colorize"], "ml" : "./colorize.ml" },
{ "module" : ["Easy_logging"], "ml" : "./easy_logging.ml" },
{ "module" : ["Formatters"], "ml" : "./formatters.ml" },
{ "module" : ["Handlers"], "ml" : "./handlers.ml" },
{ "module" : ["Logging"], "ml" : "./logging.ml", "mli" : "./logging.mli" },
{ "module" : ["Logging_infra"], "ml" : "./logging_infra.ml" },
{ "module" : ["Logging_types"], "ml" : "./logging_types.ml" }]

The next step is to sort the list. I tried:

codept -pkg unix -pkg calendar -sort colorize.ml easy_logging.ml formatters.ml handlers.ml logging.ml logging.mli logging_infra.ml logging_types.ml

But the output is empty! What did I do wrong?

Also, I presume that to extract the list of files from the JSON data, I need to write a script using the Yojson package.
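
For reference, such a script can stay quite short. Here is a minimal sketch using Yojson.Basic, assuming that the "local" array sits at the top level of the JSON object printed by -deps (adjust the path if your codept version nests it differently). The function name local_files and the sample string are only illustrative.

```ocaml
(* Build with the yojson library, e.g.
   ocamlfind ocamlopt -package yojson -linkpkg extract.ml -o extract *)
open Yojson.Basic.Util

(* Collect the "mli" and "ml" paths of every entry of the "local" array.
   Assumption: "local" is a top-level field of the JSON document. *)
let local_files (json : Yojson.Basic.t) : string list =
  json
  |> member "local"
  |> to_list
  |> List.concat_map (fun entry ->
         (* [member] yields `Null for a missing key, which
            [to_string_option] turns into None. *)
         List.filter_map
           (fun key -> entry |> member key |> to_string_option)
           [ "mli"; "ml" ])

(* Shortened sample mimicking the fragment shown above. *)
let sample =
  {|{ "local" :
      [ { "module" : ["Colorize"], "ml" : "./colorize.ml" },
        { "module" : ["Logging"], "ml" : "./logging.ml",
          "mli" : "./logging.mli" } ] }|}

let () =
  Yojson.Basic.from_string sample
  |> local_files
  |> List.iter print_endline
```

To process real output, replace Yojson.Basic.from_string sample with Yojson.Basic.from_channel stdin and pipe the codept output into the program.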

The -sort mode was broken at some point; I have just pushed a fix for this option. If you install the latest version with opam pin --dev codept, then

codept *.ml *.mli -pkg unix -pkg calendar -ancestors-of easy_logging.ml -sort

prints directly

logging_types.ml logging_infra.ml colorize.ml formatters.ml handlers.ml logging.mli logging.ml easy_logging.ml

Note that you can also silence all warnings with -quiet:

codept *.ml *.mli -quiet -ancestors-of easy_logging.ml -sort
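
To give an idea of what combining -ancestors-of with -sort amounts to, here is a rough sketch of a depth-first traversal that lists the requested files together with everything they transitively depend on, dependencies first. This is not codept's actual implementation: the dependency table and the function name sorted_ancestors are hypothetical, and the graph is assumed to be acyclic.

```ocaml
(* Hypothetical dependency table: each file is mapped to the files
   it directly depends on. *)
let deps =
  [ ("a.ml", [ "b.ml"; "c.ml" ]);
    ("b.ml", [ "c.ml" ]);
    ("c.ml", []);
    ("d.ml", [ "a.ml" ]) ]

(* Return [roots] and all their transitive dependencies, ordered so that
   every file comes after the files it depends on.
   Assumes the dependency graph has no cycle. *)
let sorted_ancestors deps roots =
  let rec visit acc file =
    if List.mem file acc then acc
    else
      let children = try List.assoc file deps with Not_found -> [] in
      (* Visit the dependencies first, then record the file itself. *)
      let acc = List.fold_left visit acc children in
      file :: acc
  in
  List.rev (List.fold_left visit [] roots)

let () =
  sorted_ancestors deps [ "a.ml" ] |> String.concat " " |> print_endline
```

Note that d.ml is left out of the result, since a.ml does not depend on it.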