It is currently unclear how to package and distribute an OCaml package that binds a piece of Rust code. Rust is a popular language, especially for cryptography, and it’s safer than C, making it an important technology for the OCaml ecosystem.
In this post we explore a few issues and possible solutions to integrate Cargo and Opam, hoping to start a discussion with the community. At the end we describe the concrete use case of Tezos that sparked this investigation at Nomadic Labs.
The main problem
Rust does not have a stable ABI and the approach taken is to always recompile from sources in a local environment (like a local _opam switch).
The official approach is to discourage the installation of compiled libraries and only install executables.
Opam, on the other hand, allows installing compiled libraries, so an OCaml library that binds and links a piece of Rust code breaks this Rust invariant.
In particular there is the risk of linking to several versions of the same Rust dependency because the constraints are solved separately.
A possible solution for Opam to respect this invariant could be that at each installation of an executable package containing Rust code, all installed libraries containing Rust code should be recompiled.
A somewhat similar solution is employed by Debian: “So, we can’t reasonably ship compiled versions of Rust libraries. Instead, library packages ship source code, and application packages build all the library crates they use from source.”
How to bind Rust
The approach used to bind an existing Rust crate foo is to:
- write a pure Rust crate foo-ffi which exposes selected functions of foo in a C-compatible way, using Rust’s standard FFI.
- write a binding ocaml-foo which binds foo-ffi, either with hand-written stubs or with ctypes.
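As a rough illustration of the first step, here is a minimal sketch of what a foo-ffi crate could contain. The inline foo module, its checksum function and the exported name foo_ffi_checksum are all hypothetical stand-ins invented for this example; a real foo-ffi would depend on the actual foo crate instead.

```rust
// Hypothetical stand-in for the existing crate `foo`; in a real project this
// would be a dependency declared in the Cargo.toml of foo-ffi.
mod foo {
    /// Sums the bytes of `data`, wrapping on overflow.
    pub fn checksum(data: &[u8]) -> u32 {
        data.iter().fold(0u32, |acc, b| acc.wrapping_add(u32::from(*b)))
    }
}

/// C-compatible wrapper around `foo::checksum`. The exported name is an
/// assumption made for this sketch.
///
/// # Safety
/// `ptr` must either be null or point to `len` readable bytes.
// `#[unsafe(no_mangle)]` needs Rust 1.82 or later; older toolchains spell it
// `#[no_mangle]`.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn foo_ffi_checksum(ptr: *const u8, len: usize) -> u32 {
    if ptr.is_null() {
        return 0;
    }
    // SAFETY: the caller guarantees that `ptr` points to `len` readable bytes.
    let data = unsafe { std::slice::from_raw_parts(ptr, len) };
    foo::checksum(data)
}
```

Only C-representable types (raw pointers, fixed-size integers) cross the boundary, so the OCaml side can treat the symbol like any C function. Building the crate with crate-type set to staticlib in its Cargo.toml produces a .a archive that the ocaml-foo binding can link against.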
Cargo and Opam
Cargo is roughly the equivalent of dune + opam: it takes care of compilation (using the compiler rustc), dependencies and packaging.
Cargo.toml is equivalent to an .opam file and, among other things, it declares dependencies.
Cargo.lock contains the result of solving the dependency constraints declared in the Cargo.toml files. It contains the exact version of each crate that will be downloaded and compiled. Lock files should be checked into the sources of binaries to have reproducible builds, but they should not be included for libraries.
Finding dependencies
Online
Normally cargo build would download all required dependencies, which is forbidden by Opam’s sandbox.
We could make an exception in Opam’s sandbox for cargo build to contact exclusively the official registry, crates.io.
Alternatively we could add a new backend for depext to run cargo build, much like an external package manager.
A related discussion can be found in issue 3460.
Offline
Dependencies can also be vendored using cargo vendor, or downloaded to a local registry as Debian does. The two solutions are almost equivalent.
An example of vendoring can be found in batsat-ocaml.
Finding a compiled library to link to
Find the “local switch”
Every run of cargo build is done in a local switch (in opam terms), which is by default located in the directory target/release next to the Cargo.toml that defines the build.
This directory can be found by dune/opam using cargo metadata and can be customized (e.g. to be inside an opam switch).
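To make the lookup concrete, here is a small sketch of a helper that a build script could use to ask cargo where the build directory is. It relies on the target_directory field of the JSON printed by cargo metadata; the helper names are hypothetical, and the string extraction is deliberately naive (no real JSON parser, escape sequences other than an escaped quote or backslash are not decoded), so a real tool would rather use a proper JSON library.

```rust
use std::process::Command;

/// Returns the string value stored under `key` in `json`.
/// Naive on purpose: it looks for the first occurrence of `"key"` followed by
/// a string value, and keeps escaped characters as-is without decoding them.
fn json_string_field(json: &str, key: &str) -> Option<String> {
    let needle = format!("\"{}\"", key);
    let start = json.find(needle.as_str())? + needle.len();
    let rest = json[start..].trim_start().strip_prefix(':')?;
    let rest = rest.trim_start().strip_prefix('"')?;
    let mut out = String::new();
    let mut chars = rest.chars();
    while let Some(c) = chars.next() {
        match c {
            '"' => return Some(out),
            '\\' => out.push(chars.next()?),
            _ => out.push(c),
        }
    }
    None
}

/// Runs `cargo metadata` in `dir` and returns the build directory it reports,
/// or `None` if cargo fails or the field is not found.
fn cargo_target_dir(dir: &str) -> Option<String> {
    let output = Command::new("cargo")
        .args(&["metadata", "--format-version", "1", "--no-deps"])
        .current_dir(dir)
        .output()
        .ok()?;
    if !output.status.success() {
        return None;
    }
    let json = String::from_utf8(output.stdout).ok()?;
    json_string_field(&json, "target_directory")
}
```

The same helper applied to the workspace_root field gives the root of the enclosing workspace. Compiled release artifacts are then expected under the release subdirectory of the reported build directory.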
Installed in the system
An FFI library is stable with respect to C’s ABI, so it could be installed in the system. There is no cargo support for this though.
Sharing dependencies in a workspace
When building multiple crates for a project, it is important to solve all dependencies at the same time to share a maximum of code and avoid linking two versions of the same library.
This can be achieved by defining a cargo workspace that declares the crates as members. The result is a single build directory for all of them, with dependencies shared.
Note that this solution works at the root of a project but can’t be nested in subprojects as nested workspaces are not supported by cargo.
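One way to check that a workspace really shares its dependencies is to look for crates that are locked at more than one version. The sketch below does this with a line-based scan of the text of a Cargo.lock file, relying only on the [[package]] entries and their name and version keys; the function names are hypothetical, and no TOML parser is involved, so it is an illustration rather than a robust tool.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Parses a line of the form `key = "value"` and returns `value`.
fn quoted_value(line: &str, key: &str) -> Option<String> {
    let rest = line.strip_prefix(key)?.trim_start().strip_prefix('=')?;
    let rest = rest.trim().strip_prefix('"')?.strip_suffix('"')?;
    Some(rest.to_string())
}

/// Maps each crate name found in the text of a Cargo.lock file to the set of
/// versions locked for it.
fn locked_versions(lock: &str) -> BTreeMap<String, BTreeSet<String>> {
    let mut versions: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    let mut in_package = false;
    let mut name: Option<String> = None;
    for line in lock.lines() {
        let line = line.trim();
        if line.starts_with('[') {
            // A new table starts; only `[[package]]` entries are of interest.
            in_package = line == "[[package]]";
            name = None;
        } else if in_package {
            if let Some(n) = quoted_value(line, "name") {
                name = Some(n);
            } else if let Some(v) = quoted_value(line, "version") {
                if let Some(n) = name.take() {
                    versions.entry(n).or_default().insert(v);
                }
            }
        }
    }
    versions
}

/// Returns the names of the crates that are locked at more than one version,
/// in alphabetical order.
fn duplicated_crates(lock: &str) -> Vec<String> {
    locked_versions(lock)
        .into_iter()
        .filter(|(_, vs)| vs.len() > 1)
        .map(|(name, _)| name)
        .collect()
}
```

An empty result from duplicated_crates means every crate in the lock file is resolved to a single version. On the command line, cargo tree --duplicates reports the same kind of information.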
Tezos case
There are currently two Rust libraries that Nomadic Labs wrote OCaml bindings for, librustzcash and bls12-381. These libraries share a large number of dependencies.
The solution currently used to build Tezos from source is to declare a Cargo workspace at the root of the project with the two vendored Rust libraries as members. This ensures that dependencies are solved at the same time and the result can be inspected in the resulting Cargo.lock which is committed to git.
Inside both OCaml bindings, Dune uses cargo metadata to find the workspace root and the compiled .a libraries to link to.
The question now is how to obtain a similar result when installing Tezos from an Opam package. Hence the above discussion.