Dune builds in pkgsrc: variables, workspaces, and install dirs

I’m working on packaging a new version of Coq for the pkgsrc system (cf. https://www.pkgsrc.org/). The system environment created by pkgsrc is slightly different from that created by, e.g., Debian or SUSE. In particular, the names of certain relevant executables (python, sphinx-build, etc.) and the installation prefix locations are somewhat different. While some of these configuration items could be passed through environment variables, doing so is not always optimal, and some other build configuration items cannot naturally be passed that way.

After becoming familiar with the dune documentation, I concluded that the most natural approach to accommodate the differences in system environments would be to produce a workspace definition that includes system-specific settings. This way, I hoped, the names of the executables, as well as other parameters, could be recorded in a set of global dune variables, and then applied to the code.

Unfortunately, the documentation does not appear to indicate any way to set variables from dune-workspace specifications. The closest subsystem described in the documentation that comes near to implementing this functionality is dune-configurator. While it would be relatively straightforward to use it to set variables, it is also rather awkward. Indeed, in that case, the build-specific values would need to be written into yet another file (say, config/global-vars.sexp), to be subsequently read by the configurator script and applied, without regard to the concept of a workspace. This approach would introduce yet another entity for a functionality that belongs to the workspace.

I also have another question. Does the dune build system encode any paths into any of the files that it installs? The way installation works in pkgsrc (and this mirrors a number of other systems) is as follows: after the build completes, files are copied from the build tree into a staging directory, from which they are archived into a package, and that package is installed by the normal system tools. I have seen discussion on this forum of installing projects built by dune into /usr/local/, /usr/, and other system directories, but I have not seen a discussion about installation into a staging directory. Does dune make any assumptions that would break installation into a staging directory?

Finally, I’m not clear on how the installation manifests are generated (which, as I understand it, are targeted at opam). What file sets are described in these manifests? How does opam treat those manifests? Does opam perform a simple cp operation for each listed file? Also, do the contents depend on the compilation mode? If so, is there a natural way to specify which parts of the source tree should be built as native and which as bytecode, and also to get a failure if those constraints cannot be met?

Hi @azarens, glad to see some NetBSD packaging going on! It would be easier to understand exactly what you’re trying to package first, since for OCaml/OpenBSD ports this didn’t require too many build gymnastics.

Dune will build your source code and assemble an installation layout via the dune build @install alias, which you can find in _build/install/default (for the default workspace).

dune install --destdir $DESTDIR --prefix $PREFIX should then do what you want in order to stage the files in pkgsrc. Check dune install --help for all the options you can set.
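A minimal sketch of how those two steps might be wrapped in a packaging script. The function name stage_install is made up for illustration, and the staging and prefix values are whatever pkgsrc supplies; only the @install alias and the --destdir and --prefix options come from the commands mentioned above.

```shell
#!/bin/sh
# Sketch: build the install layout, then stage it under a staging directory.
# stage_install is a hypothetical helper name, not part of dune or pkgsrc.
stage_install() {
    destdir=$1
    prefix=$2
    # Build everything attached to the @install alias.
    dune build @install || return 1
    # Copy the layout into the staging area rather than the live system.
    dune install --destdir "$destdir" --prefix "$prefix"
}
```

It would be called as stage_install "$DESTDIR" "$PREFIX" from the install phase of the package.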

Why isn’t this just done by setting the PATH in the pkgsrc wrapper for the build, to get the right executables? Some specific examples of problematic cases would be helpful, including what other build configuration items you are having trouble passing that you mention. Some things like CFLAGS (commonly set by ports-style systems) can be a little problematic as the compiler package has to deal with that.

The package directory that I’m working on is as follows:
pkgsrc/lang/coq at trunk · NetBSD/pkgsrc · GitHub

I’m more than happy to set up package builds with a nice build system (dune) on a nice packaging system (pkgsrc).

Yes, this is what the code in pkgsrc/mk/ocaml.mk is programmed to perform.

The reason why this course of action is not workable is as follows. Some executables, particularly Python-related ones, are renamed by pkgsrc. For instance, sphinx-build becomes sphinx-build-3.9, while python becomes python3.9.

What I’m looking for under doc/dune is something along the lines of (run env COQLIB=%{project_root} %{sphinx_build_bin} -q %{env:SPHINXWARNOPT=-W} -b latex sphinx %{targets}) rather than (run env COQLIB=%{project_root} sphinx-build -q %{env:SPHINXWARNOPT=-W} -b latex sphinx %{targets})

Thank you for this suggestion. Would it be correct to assume that the staged installation is actually well supported by dune?

I have seen in the install --help output options to specify prefix. What about specifying configdir (etc/xdg/coq/), docdir (share/doc/coq/), emacslib (share/emacs/site-lisp/), coqdocdir (tex/latex/coq/ from coq distribution)? It seems like those are paths that belong to variables.

Does dune in general respect CFLAGS coming from the environment?

Let’s consider the following segment of Makefile code from lang/coq.

PLIST_VARS+=		native
.if ${OCAML_USE_OPT_COMPILER} == "yes"
COQIDE_TYPE=		opt
PLIST.native=		yes
CONFIGURE_ARGS+=	-native-compiler yes
UNLIMIT_RESOURCES+=	stacksize # compilation of some files needs this
BUILD_TARGET=		world
.else
COQIDE_TYPE=		byte
CONFIGURE_ARGS+=	-native-compiler no
BUILD_TARGET=		byte
INSTALL_TARGET=		install-byte
.endif
PLIST_SUBST+=		COQIDE_TYPE=${COQIDE_TYPE}

We need to pass the type of build desired, and the type of build desired for coqide. I suppose that, perhaps after a discussion through the appropriate channels, we may introduce a requirement that the type of the coqide build be equal to that of coq itself. However, a decision about the type of build needs to be made early, so that the relevant PLIST variables can be set. Then, the result of this decision needs to be passed to dune. As I understand it, the is_native boolean should be passed to the dune build system as a dune variable.

A similar consideration exists for the case of native dynamic linking.

Would you suppose that these parameters should rather be passed as command-line options to dune?


Another case to consider. How would it be best to set the following directory locations with dune:

EGDIR=		${PREFIX}/share/coq/examples
CONF_FILES=	${EGDIR}/coqide-gtk2rc ${PKG_SYSCONFDIR}/xdg/coq/coqide-gtk2rc

If they were encoded into variables, doing so would be relatively trivial — up to an upstream patch to coq project, perhaps.

Current considerations.

I am currently looking to produce a workspace file (say, dune-workspace.pkgsrc, generated from dune-workspace.pkgsrc.in) that sets dune variables for the various paths used by the dune build. The remaining tasks are simple adjustments to what already exists in the package. Since the current package works with a Makefile-based build, only the dune-related issues need clarification.
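To make the idea concrete, the generation step might look like the sketch below. Everything specific in it is hypothetical: the helper name gen_workspace, the @SPHINX_BUILD@ placeholder, and the SPHINX_BUILD variable. The template shown in the comment uses the env-vars field of the env stanza as I understand it from the dune documentation; whether that is the right mechanism for this purpose is exactly what I am unsure about.

```shell
#!/bin/sh
# Sketch: generate a workspace file from a template by placeholder substitution.
# gen_workspace, @SPHINX_BUILD@ and SPHINX_BUILD are hypothetical names.
#
# Example template contents (dune-workspace.pkgsrc.in):
#   (lang dune 2.0)
#   (env
#    (_
#     (env-vars
#      (SPHINX_BUILD @SPHINX_BUILD@))))
gen_workspace() {
    template=$1
    output=$2
    sphinx_build=$3
    # '|' is used as the sed delimiter, so the value must not contain '|'.
    sed -e "s|@SPHINX_BUILD@|${sphinx_build}|g" "$template" > "$output"
}
```

A call such as gen_workspace dune-workspace.pkgsrc.in dune-workspace.pkgsrc sphinx-build-3.9 would then be followed by a build that selects the generated file with dune's --workspace option.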

Is there a canonical dune approach?

I would appreciate the perspective on what would be the most canonical approach to handling these issues with dune.

Do you think that I am making a correct assumption in thinking that the workspaces functionality is the “right” way to handle differences in build and packaging environments?

I realize that dune seeks to configure the build automatically, but there always exist (a) optional host configuration and (b) specific host configuration. The difference in binary naming under pkgsrc is a prime example of (b), while the decision about which packages should be built, and with which compiler type and options, is an example of (a). While it is not entirely implausible that all of (a) could be passed through some standard set of command-line options to dune, it is entirely implausible that (b) could be described by any set of standardized options. I think that at least (b) does belong in workspace specifications.

Hello Anil (@avsm),

One of the very specific questions that I have at this point is the question about precise (as much as reasonably and practically possible, of course) semantics of dune regarding handling the installation of files.

  1. Does dune generate .install manifests?
  2. How does dune generate .install manifests? What inputs does it use? What decisions does it make?
  3. What is the best approach for a programmer to specify these manifests?
  4. Is it safe to cp files listed in the manifest from the staging dir to a destination dir? (Based on our previous discussion, the answer is most likely “Yes”.)

I would appreciate input on these questions. So far, I have been able to solve most of the build problems proper with a dune-based build of Coq on NetBSD. I also became aware of a NetBSD-specific tool for installing the files listed in opam manifests; the tool is called opaline, and it has a GitHub repository.

Also, there’s another point about installation that I would like to discuss.

  1. I noticed that some of the build artefacts are linked with a symlink. I would like to place regular files (not links) into the filesystem. What do the .install manifests list? Do they list links?

Hello Anil (@avsm),

Another question emerged as I tried to finalize the install of package files. It appears that not all of the files that should have been available as native code libraries were in fact built as such. Thus I wonder if there exists an approach to enable the following configuration:

  1. Request dune to build all artefacts as native code items. The invocation that I have in mind is dune build -p pkg; note there’s no @install or @all included (though it shouldn’t really matter). A more desirable configuration would be to choose what combination of native code and bytecode to build (i.e. all native + all byte / all byte / all native).

Currently, dune does allow building individual objects of a specific type by specifying specific paths. However, that does not scale to a whole package.

Hi Aleksey,

Anil posted a link to this thread in the Dune developer chat. I haven’t read the full discussion, but here are answers to your last questions: Dune does generate .install manifest files. The file <package>.install contains the listing of every file that should be installed, grouped per section.

Such files are generated based on what programmers write in their dune files. For instance, if the programmer writes something like this:

(install
 (section bin)
 (package foo)
 (files (prog.exe as prog)))

Then Dune will add the following entry to foo.install:

bin: [
  "_build/install/default/bin/prog"
]
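If you want to inspect such a file from a script, something like the following sketch may be enough. It assumes one entry per line, each starting with a double-quoted source path as in the example above; list_install_sources is just a name made up for illustration.

```shell
#!/bin/sh
# Sketch: print the source path of every entry in a .install file.
# Assumes one entry per line, each beginning with a double-quoted path.
# list_install_sources is a hypothetical helper name.
list_install_sources() {
    # Split on double quotes; the second field is the first quoted string.
    awk -F'"' '/^[[:space:]]*"/ { print $2 }' "$1"
}
```

Section headers such as bin: [ and the closing ] do not start with a quote, so they are skipped.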

Or if the programmer declares that a library is public, by adding a (public_name <package-name>) or (public_name <package-name>.<something>) field, then Dune will generate lib entries for the various files of the library.

Regarding question 3, programmers don’t need to think in terms of the manifest file. They should simply tell Dune what libraries and executables are public, and what additional files they want their package to install. They should do that by adding the relevant public_name fields and install stanzas. .install files are mostly an implementation detail and are meant for package management tools. If in the future a new standard emerged and .install files would become unused, Dune would switch to the new standard and it should be transparent for Dune users.

It should be mostly safe to copy files from the staging directory to the destination directory. Although, the two current consumers of these files (opam-installer and dune install) both do a bit more, more precisely they both set exact permissions on the destination files.

However, a recent version of Dune introduced a new “site” feature which breaks the assumption that files can simply be copied and their permissions set. Indeed, it requires some files to be rewritten at installation time to set in stone a few runtime parameters. dune install performs this rewriting but opam-installer does not, which means that packages using the “site” feature must be installed via dune install. We recently discovered this issue and haven’t yet made a decision as to how to move forward. So this is a good time to chime in if you’d like to bring a more package management point of view. The relevant issue to discuss this is this one.

Yes, they do list symbolic links. During the build, Dune sets up symbolic links in the _build/install/default directory (or, more generally, _build/install/<build-context>) pointing to the real artefacts. This directory mimics the installation layout. This makes it possible to have tests that work “as if the package had been installed”. These symbolic links are the ones listed in .install files.
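If you copy from that directory yourself rather than going through dune install, you can ask cp to follow the links so that regular files end up at the destination. A minimal sketch, where copy_dereferenced is a made-up helper name and the source and destination directories are whatever your packaging step uses:

```shell
#!/bin/sh
# Sketch: copy an install layout, following symbolic links so that the
# destination contains regular files. copy_dereferenced is a hypothetical name.
copy_dereferenced() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    # -R recurses into directories; -L follows symbolic links in the source.
    cp -RL "$src"/. "$dst"/
}
```

Note that this only copies file contents; it does not set permissions or perform any install-time rewriting.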

Do you have in mind to take the list of files in the .install file, partition it in two (byte and native), and then have a simple way to ask Dune to build one of the two partitions? If yes, that could be done, but there are some pitfalls. For instance, if the programmer wrote a rule to generate a source file using a generator that is locally built, then unless the programmer explicitly requests the bytecode version of the generator, Dune will use whichever version it thinks is best. More precisely, it will use the native version if the native compiler is available. This means that requesting the bytecode artefacts could still involve native compilation. You could get around this by hiding the native compiler during bytecode compilation, but then you’d lose sharing between the two builds. Anyway, nothing is impossible here, but more design and thinking is required. In the end, it is probably easier to ask Dune to build everything and then split the result into two binary packages.
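For the splitting step, a packaging script could classify installed files by extension. The sketch below is only an illustration: the grouping is an assumption based on the usual OCaml file extensions, not something Dune computes, and classify_artifact is a made-up name. Files such as .a, .o, and executables are left in the "common" group here and would need a case-by-case decision.

```shell
#!/bin/sh
# Sketch: classify an installed file by its extension.
# The grouping is an assumption for illustration, not Dune's own logic.
# classify_artifact is a hypothetical helper name.
classify_artifact() {
    case $1 in
        *.cmx|*.cmxa|*.cmxs) echo native ;;   # native-code artefacts
        *.cma|*.cmo)         echo byte ;;     # bytecode artefacts
        *)                   echo common ;;   # everything else
    esac
}
```

Running it over the paths in the staging directory would give the two file lists for the binary packages.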
