Experience Report: Refining Dune’s Dependency Graph with Per-Module Library Filtering

I’ve been working on improving Dune’s inter-library dependency tracking, and wanted to share the experience — both the technical details and what it’s like as a first-time contributor to this large open source OCaml project.

The Problem I Took On

When libA depends on libB, Dune gives every module in libA a glob dependency on all .cmi files in libB. If any .cmi in libB changes, every module in libA is recompiled — even modules that never reference libB.

For projects with many libraries, this creates a cascade of unnecessary recompilations. Issue #4572, which tracks this problem, has been open since 2021.

My Approach

Dune already runs ocamldep to compute intra-library module dependencies. The key insight: that same output tells us which libraries each module references, via their entry module names. We can use this to filter both the build dependencies and the -I/-H compiler flags per-module.

The implementation (PR #14116 and PR #14186) works as follows:

  1. For each module and its transitive intra-library dependencies, read the ocamldep output (both .ml and .mli)
  2. Union all referenced module names, including -open flags
  3. Map those names to libraries via a Lib_index
  4. Transitively close the filtered library set via Lib.closure
  5. Use the result for both hidden deps and -I/-H compiler flags, partitioning into direct (visible via -I) and hidden (via -H) based on requires_compile membership
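The name-to-library mapping and closure steps above can be sketched as follows. This is an illustrative model only, not Dune’s actual API: `referenced_names`, `lib_index`, `lib_deps`, and the hand-rolled `closure` (standing in for `Lib.closure`) are all hypothetical.

```ocaml
module String_set = Set.Make (String)

(* Assumed input: module names referenced per ocamldep output,
   already unioned over .ml/.mli and -open flags. *)
let referenced_names = String_set.of_list [ "LibB"; "Stdlib" ]

(* Assumed: a Lib_index mapping entry-module names to libraries. *)
let lib_index = [ ("LibB", "libB"); ("LibC", "libC") ]

(* Assumed: direct deps of each library, used for the transitive
   closure (a stand-in for Lib.closure). *)
let lib_deps = [ ("libB", [ "libC" ]); ("libC", []) ]

let rec closure acc lib =
  if String_set.mem lib acc then acc
  else
    let acc = String_set.add lib acc in
    let deps = try List.assoc lib lib_deps with Not_found -> [] in
    List.fold_left closure acc deps

(* Steps 3-4: map referenced names to libraries, then close. *)
let filtered_libs =
  String_set.fold
    (fun name acc ->
      match List.assoc_opt name lib_index with
      | Some lib -> closure acc lib
      | None -> acc)
    referenced_names String_set.empty

let () =
  assert (
    String_set.equal filtered_libs (String_set.of_list [ "libB"; "libC" ]))
```

Note that `libC` ends up in the result even though no module name maps to it directly; that is the closure step at work, which matters for the transparent-alias case discussed below.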

With both deps and flags filtered, a clean build will fail if a module references a library it doesn’t declare — previously, overly-broad -I flags could mask such errors.

A False Start

My first attempt (PR #14021) tried to implement the filtering in a single PR without sufficient test coverage. It was closed after review revealed that the approach was fragile in edge cases I hadn’t anticipated — particularly around transparent module aliases and virtual libraries.

Challenges

Transparent module aliases. OCaml’s module alias mechanism means ocamldep doesn’t always report all libraries a module transitively depends on. If libB has module M = LibC.Something, and a module in libA uses LibB.M, ocamldep reports LibB but not LibC. The fix: transitively close the filtered library set using Lib.closure, bounded by the compilation context.

Root modules. The (root_module) stanza creates a module that implicitly aliases all libraries in the compilation context. When ocamldep reports a reference to a root module, we can’t determine which underlying libraries are actually needed, so we fall back to the full dependency set.

Virtual libraries. When virtual library implementations are present in the compilation context, parameter libraries may not appear in requires_compile, so filtering could miss them. Another fallback case.

Menhir-generated modules. These mock modules aren’t in the ocamldep dependency graph, so we skip filtering for them.

Null build overhead. The filtering reads .d files and computes library closures per-module. On a fresh dune process (no memo cache), this is new work on every build — including null builds where nothing changed. This is a real trade-off: better incremental rebuild performance at the cost of some null-build overhead.

Prerequisite Test PRs

Before the implementation PRs, I submitted six test-only PRs to document existing behavior and establish a safety net:

  • #14017 — Baseline tests documenting current inter-library recompilation behavior
  • #14031 — Test documenting module name shadowing between stanzas and libraries
  • #14100 — Test verifying library file deps in compilation rules and sandboxed builds
  • #14101 — Test verifying transparent alias incremental build safety
  • #14129 — Test verifying incremental builds with alias re-exported libraries
  • #14178 — Test documenting ocamldep behavior with transparent alias chains

This kept the implementation PRs’ diffs focused on the actual change and gave reviewers confidence that existing behavior was preserved. It also helped me understand the edge cases that had tripped up my first attempt.

The Review Process

The Dune maintainers (@rgrinberg and @art-w) provided thorough, constructive reviews. Some highlights:

  • Replacing my hand-rolled transitive closure with Lib.closure from the existing library — a cleaner approach I wouldn’t have found without familiarity with Dune’s internals
  • Identifying that both .ml and .mli ocamldep output need to be read, since the interface can reference different libraries than the implementation
  • Suggesting per-module -I/-H flag filtering, which makes clean builds more precise and improves caching
  • Questioning every fallback case and special-cased module kind, leading to simpler code

The PRs went through significant refactoring during review — the final versions are substantially tighter than the initial submissions.

What Could Be Better

Working on this was a positive experience overall, but a few things created friction:

No way to benchmark before merging. The null-build overhead question came up late in the process. I discovered through manual benchmarking that the change added ~70% to null build time — a significant regression. Dune’s benchmark CI workflow runs only on pushes to main, not on PRs. Contributor-accessible performance tooling would help catch regressions before they land.

Review momentum vs. rebasing. The test PRs merged quickly, but the implementation PR required multiple rounds of review over days. Between rounds, main moves forward, requiring rebases that risk introducing conflicts. The contributor carries the burden of keeping branches fresh. This is compounded when PRs depend on each other — every rebase of #14116 required rebasing #14186 as well. GitHub has no first-class support for PR stacks, so this is manual and error-prone. Of course, all GitHub-hosted repos suffer from this.

Flaky CI. Many CI runs failed for reasons unrelated to my code — most often an upstream provider of an OCaml package that was temporarily unreachable or faulty. These problems usually resolved themselves, but added day-long delays to the PR lifecycle. The root cause is setup code that is re-run from scratch in every CI job.

Reflections

The Dune codebase is well-structured, with clear separation between the build engine, rule generation, and scheduler. The code quality is high, which makes the effort of keeping it high feel worthwhile.

The cram test infrastructure is well suited to this kind of work. Each test scenario is a self-contained shell script with expected output, making it easy to document and verify exact recompilation behavior. It inspires confidence in the code.

The maintainers have been responsive and the review process, while slowed by thoroughness, is collaborative and professional. Thank you, maintainers!


It does now: GitHub Stacked PRs


Fascinating report! I am curious about the efficiency impact of repeated transitive-closure computations. Can you link to the CI benchmark reports and the discussions of the efficiency cost?

I would assume that it is reasonably easy to cache transitive dependency computations. In particular, if a module A depends on C and D, you don’t need to compute the closure of {C, D}, you can compute the closure of C and the closure of D, and union them. Then if B depends on D and E, all the work computing the closure of D can be shared. You need to compute dependency closure at most once per dependency, so intuitively it should be the same amount of work to compute dependencies of a whole library (set of modules) or compute dependencies of each of its modules separately.
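The sharing argument above can be made concrete with a memoized closure. This is a sketch under the same assumptions as the post (a per-library direct-dependency map); `lib_deps`, `cache`, and `deps_of` are illustrative names, not Dune internals.

```ocaml
module String_set = Set.Make (String)
module String_map = Map.Make (String)

(* Assumed direct dependencies: A uses {C, D}, B uses {D, E}. *)
let lib_deps =
  String_map.of_seq
    (List.to_seq [ ("C", [ "D" ]); ("D", []); ("E", [ "D" ]) ])

(* Memo table: each library's closure is computed at most once. *)
let cache : String_set.t String_map.t ref = ref String_map.empty

let rec closure lib =
  match String_map.find_opt lib !cache with
  | Some s -> s
  | None ->
      let deps =
        match String_map.find_opt lib lib_deps with
        | Some d -> d
        | None -> []
      in
      let s =
        List.fold_left
          (fun acc d -> String_set.union acc (closure d))
          (String_set.singleton lib)
          deps
      in
      cache := String_map.add lib s !cache;
      s

(* Per-module deps are unions of per-library closures, so the work
   for D is shared between modules A and B. *)
let deps_of libs =
  List.fold_left
    (fun acc l -> String_set.union acc (closure l))
    String_set.empty libs

let () =
  assert (String_set.elements (deps_of [ "C"; "D" ]) = [ "C"; "D" ]);
  assert (String_set.elements (deps_of [ "D"; "E" ]) = [ "D"; "E" ])
```

With memoization, computing deps for every module costs at most one closure per library plus the unions, matching the intuition that per-module filtering need not be asymptotically worse than per-library filtering.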


Welcome to dune development and many thanks for your PRs!

It’s easy to miss because it’s generally green, but the CI also ran ocaml-benchmarks on your PRs :slight_smile: This is most likely where the 70% regression on null builds was observed: CI benchmarks for #4572 (4s to 7s for null builds on the dune codebase itself; no change for clean builds).

As noted by @rgrinberg, we’ll need to fix this regression before being able to merge your PRs: a null build (= no changes) roughly measures the minimum amount of analysis that dune needs to do to realize there’s nothing to do (!)… so in their current state, the PRs would make every build slower to enable an optimization that doesn’t save as much time in general (*). Fortunately, @gasche’s insight is spot on! (Currently the transitive closures are recomputed from scratch for every module, with no reuse of the redundant work… I’m hopeful that we can fix it!)

(*) @rgrinberg gave an example of two modules where we would have expected a speedup, one depending on a “slow” library that takes a lot of time to build and the other depending on a “fast” library. Before your PRs, observing the build on two cores would have looked like:

CPU1: [build slow library .......] -> [build use_slow.ml .....]
CPU2: [build fast lib]             -> [build use_fast.ml] (waiting for "slow")

With the more precise analysis of library dependencies per module, the build of use_fast.ml doesn’t need to wait for the slow library to finish building:

CPU1: [build slow library .......] -> [build use_slow.ml .....] |
CPU2: [build fast lib] -> [build use_fast.ml]                   |
                                                                | build done!

So sadly, given the pre-existing opportunities for build concurrency, the total build time remains the same in this example. But we are still interested in this optimization because it improves the build graph, creating more opportunities for concurrency and caching! Concretely, in the example it would speed things up if use_fast.ml were the one with a slow compile time; and on an incremental change to the “fast” lib, skipping the unnecessary rebuild of use_slow.ml would also be more efficient :slight_smile:


Thanks, @art-w and @gasche for your ideas. Yes, this re-computation of closures is inefficient and @rgrinberg has already produced a commit that I have cherry-picked into the 14116 branch. It seems to make a difference, but not by much (6.6 seconds null build versus 4.0 on ‘main’). The investigation continues (and the discussion remains on GitHub).

Yeah, this is probably the main issue with scaling this optimization to larger workspaces. In fact, even without this optimization, this is an area that dune handles poorly. This optimization just exposes this weakness more.

Note that this work is not yet merged, nor necessarily guaranteed to merge. It surfaces problems that must first be addressed, or at least understood.

Does Dune detect that a file is unchanged based on a hash or on file attributes, and can it re-use dependency information?

Yes, all three. It uses file modification times to decide whether further checks are needed; it uses hashes to confirm changes and enable early cutoff; and it re-uses dependency information across runs by caching it.
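The first two mechanisms can be sketched over an in-memory file model (Dune’s real implementation works against the filesystem and its memoization framework; the `file`, `stamp`, and function names here are all illustrative):

```ocaml
(* A file as seen on disk, and the stamp recorded on the previous run. *)
type file = { mtime : float; contents : string }
type stamp = { seen_mtime : float; seen_digest : string }

let digest s = Digest.to_hex (Digest.string s)

(* 1. mtime: a cheap check for whether anything *might* have changed. *)
let maybe_changed stamp file = file.mtime <> stamp.seen_mtime

(* 2. hash: confirm a real change. If the mtime moved but the contents
   hash to the same digest, downstream rules can be skipped — this is
   the early cutoff. (Mechanism 3, caching dependency info across
   runs, is not modeled here.) *)
let really_changed stamp file =
  maybe_changed stamp file && digest file.contents <> stamp.seen_digest

let () =
  let stamp = { seen_mtime = 1.0; seen_digest = digest "let x = 1" } in
  (* Touched but unchanged: mtime differs, contents are identical. *)
  let touched = { mtime = 2.0; contents = "let x = 1" } in
  assert (maybe_changed stamp touched);
  assert (not (really_changed stamp touched))
```

The split matters for performance: the mtime check is the only work done per file on a null build, and the hash is computed only when the mtime check fires.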
