Dune, cmake and -j

My project contains a vendored library which is built using cmake via the following stanza in a dune file.

(subdir libortools-build
  (rule
    (deps (source_tree ../libortools))
    (targets (dir build))
    (action (progn
      (run cmake
             -DCMAKE_CXX_STANDARD=20
             -DBUILD_absl=ON
             ...
             -DUSE_GUROBI=ON
             -S ../libortools
             -B ./build)
       ...
      (run cmake --build ./build --config Release -j)
    ))
  )
)

Without the -j option, the build is too slow. With the -j option, the build overloads the machine. Ideally, there would be some way of controlling the number of parallel jobs without requiring users to know about and set the CMAKE_BUILD_PARALLEL_LEVEL environment variable prior to running dune or opam.
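For reference, that workaround looks like this — a minimal sketch; the value 12 is an arbitrary example, and `cmake --build` only honours the variable when no explicit `-j` is passed:

```shell
# cmake --build reads CMAKE_BUILD_PARALLEL_LEVEL when no -j flag is given,
# so users currently have to export it before invoking dune or opam.
export CMAKE_BUILD_PARALLEL_LEVEL=12   # arbitrary example value
echo "$CMAKE_BUILD_PARALLEL_LEVEL"
# dune build   # cmake invoked from the rule then builds with 12 jobs
```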

This topic has been discussed several times in the past. I understand that the ideal solution is for cmake, dune, and also opam to work together to share a declared level of parallelism, possibly using a “jobserver protocol”. Is that still the current state of things? Or is there an alternative solution?

What would be wrong with the idea of requesting some parallelism in the rule, e.g., (max_parallel_jobs 12), and then having a variable return the granted parallelism, e.g., (run cmake … -j %{granted_parallel_jobs})? Is this too naive?

Until there is coordination with a job server I’m afraid it’s all going to be hacks.

A middle ground could be to set a bound on the number of jobs run by cmake. If dune and cmake are both bounded in the number of jobs, at worst you’ll have twice as many jobs in flight as the number of cores, if I’m not mistaken, and that’s a bit better than an unbounded number…

opam var jobs returns the number of jobs. That’s portable. If you use opam, you could hack something together with Dune’s setenv action and CMAKE_BUILD_PARALLEL_LEVEL, then call cmake.
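A sketch of that hack (untested; rule targets are elided as in the examples below, and it assumes opam is on the PATH and the build directory is produced elsewhere):

```
(rule
 ; `jobs` is inferred as the target of with-stdout-to
 (action (with-stdout-to jobs (run opam var jobs))))

(rule ; …
 (deps jobs)
 (action
  ; export the opam job count to the cmake child process
  (setenv CMAKE_BUILD_PARALLEL_LEVEL %{read-lines:jobs}
   (run cmake --build ./build --config Release))))
```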

You could also build the foreign library from the opam file before calling Dune, passing the opam jobs variable to cmake just as opam does for dune (see your .opam file), then import the archive built by cmake into the dune build process as a foreign archive.
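In the .opam file, that could look something like this (hypothetical paths; `jobs` and `name` are opam’s built-in variables, and the last line is the usual one opam generates for dune):

```
build: [
  ["cmake" "-S" "vendor/libortools" "-B" "build"]
  ["cmake" "--build" "build" "-j" jobs]
  ["dune" "build" "-p" name "-j" jobs]
]
```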

I do agree that at the very least Dune should re-expose the number of jobs it’s using as a variable.


Actually, when I use Dune’s foreign build sandboxing, it often happens that Dune waits for the build of the foreign library to finish with just one job in flight. In that case, the foreign build system could freely use all of the available cores, since Dune doesn’t execute any job in parallel with it.
I haven’t found a way to have Dune dump the number of parallel jobs it’s configured to use, but with opam:

(rule
 (action
  (with-stdout-to jobs
   ; conservative number of concurrent jobs
   (bash "opam var jobs 2>/dev/null || echo 2"))))
; …
(rule ; …
 (action ; …
  (run make -j%{read-lines:jobs})))

This stanza requires bash; it’s somewhat portable on Windows if executed in a Cygwin environment.


Thanks @Rucikir for the information and for sharing your dune file. That might well be the best solution for me in the short term.

I opened feat: add the %{jobs} variable #13555 on Dune, and @rgrinberg, the maintainer, made some really good points about it.

Just a heads up: this feature will make it so that your make rule pointlessly re-runs whenever you change the number of jobs. If you’re fine with that, then it’s fine. But we should really not be recommending this pattern.

It depends on whether the build is sandboxed. If it is, make will not have access to its previous build artifacts and will do a clean build. Sandboxing is of course much safer, and in general non-trivial, manually written makefiles do not specify their dependencies correctly.

The alternatives are all fundamentally unsafe and rely on actions being co-operative. One alternative might be for dune to export its -j value in some sort of environment variable that is visible to actions. That would not affect the digest of the action.

In short: if %{jobs} appears in a rule, any change to the number of jobs will invalidate the build, since the rule has changed. It is a bit wild to enable parallelism by default on Makefiles because a lot of them are broken; but if you use a build system such as CMake or Meson, then the foreign build has a much better chance of being correct and supporting parallelism. Meson uses Ninja as its backend, and CMake can use Ninja if passed -G Ninja (cmake-generators(7)). Ninja will then autodetect the number of CPUs it can use, and voilà (to be honest I haven’t tested that last bit but I think it’ll work). You can safely depend on Ninja/Meson/CMake, which are all very mature and widely-used build systems.
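Applied to the stanza from the first post, that suggestion would amount to something like this (an untested sketch; it assumes ninja is installed and on the PATH):

```
(run cmake
       -G Ninja
       -DCMAKE_CXX_STANDARD=20
       ...
       -S ../libortools
       -B ./build)
; no -j here: Ninja chooses a job count from the detected number of CPUs
(run cmake --build ./build --config Release)
```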


That’s really good of you! :smile: Thanks.

I’m looking for a solution for an OCaml wrapper to Google’s OR-Tools (just part of CP-SAT for now). The underlying build system is cmake. The parallel build works well enough, unless it’s given access to all the processors, which, surprisingly, causes my Debian 13.3 box to freeze. A non-parallel build is way too slow; that’s how I got into all of this.

There is also a discussion of this question and a proposal for a request/grant mechanism building on a suggestion by @rlepigre, but I just don’t have enough competence and time at present to work on a pull request.
