[ANN] Robur Reproducible Builds

Robur Reproducible Builds

Over the past year we at Robur have been working towards easing the deployment of reproducible MirageOS applications. The work has been funded by the European Union under the Next Generation Internet (NGI Pointer) initiative. The result is available as a website.

The overall goal is to push MirageOS into production in a trustworthy way. We worked on reproducible builds for MirageOS, with the infrastructure itself being reproducible. A handful of core packages are hosted there (described later in this article), next to several ready-to-use MirageOS unikernels: authoritative DNS servers (secondary, Let’s Encrypt DNS solver), a DNS-and-DHCP service (similar to dnsmasq), a TLS reverse proxy, Unipi (a web server that delivers content from a git repository), a DNS resolver, a CalDAV server, and of course your own MirageOS unikernel.

Reproducible builds are crucial for supply chain security: everyone can reproduce the exact same binary (by using the same sources and environment). Without reproducible builds we would not publish binaries.

Reproducible builds are also great for fleet management: by inspecting the hash of the binary that is executed, we can figure out which versions of which packages are in the unikernel, and suggest updates if newer builds are available or if a package in use has a security flaw in that version. Running albatross-client-local update my-unikernel is everything needed for an update.

In the following, we’ll explain two scenarios in more detail: how to deploy MirageOS unikernels using the infrastructure we provide, and how to bootstrap and run the infrastructure yourself. Afterwards we briefly describe how to reproduce a package, and what our core packages are and how they relate to each other.

Brief robur and MirageOS introduction

MirageOS is an operating system, developed in OCaml, which produces unikernels. A unikernel serves a single purpose and is a single process, i.e. it only has the dependencies it really needs. For example, an OpenVPN endpoint includes neither persistent storage (block device, file system) nor user management. MirageOS unikernels are developed in OCaml, a statically typed and type-safe programming language, which avoids common pitfalls (spatial and temporal memory safety issues) from the ground up.

Robur is a collective that develops MirageOS and OCaml software under open source licenses. It was started in 2017, and is part of the non-profit company center for the cultivation of technology. We received funding from several projects (prototypefund, NGI Pointer), donations, and some commercial contracts.

For someone who wants to run MirageOS unikernels

To run a MirageOS unikernel on your laptop or computer with virtualization extensions (VT-x - KVM/BHyve), first install solo5-hvt as a package (pick the one that fits your distribution), and then albatross.

No configuration is needed. Start the albatross_console and albatross_daemon services (via systemctl daemon-reload ; systemctl start albatross_daemon on Linux or service albatross_daemon start on FreeBSD). Executing albatross-client-local info should then return success (exit code 0) and report no running unikernel. You may need to be in the albatross group, or change the permissions of the Unix domain socket (vmmd.sock in /run/albatross/util/ on Linux, /var/run/albatross/util/ on FreeBSD).
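The checks described above can be wrapped in a small helper. This is only a sketch: the function name albatross_ready is made up for illustration, and the default socket path is the Linux location mentioned above. The path argument only affects the pre-checks, since albatross-client-local is invoked without extra options.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the albatross socket looks usable
# and `albatross-client-local info` exits with code 0.
albatross_ready() {
    # Linux default from the text; on FreeBSD pass /var/run/albatross/util/vmmd.sock
    sock=${1:-/run/albatross/util/vmmd.sock}
    if [ ! -S "$sock" ]; then
        echo "no Unix domain socket at $sock (is albatross_daemon running?)" >&2
        return 1
    fi
    if [ ! -r "$sock" ] || [ ! -w "$sock" ]; then
        echo "no read/write access to $sock (are you in the albatross group?)" >&2
        return 1
    fi
    # The exit code of this command becomes the exit code of the function.
    albatross-client-local info
}

# Usage (not executed here):
#   albatross_ready && echo "albatross is up"
```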

Network setup

To set up networking, you need a bridge interface, usually named service, that albatross will use for unikernels. To provide network connectivity to that bridge interface, you can either use NAT, forward public IP addresses there, provide a gateway that tunnels via VPN, or add your network interface to the bridge. In the following, we describe the setup in detail on Linux. Get in touch with us if you’re interested in other platforms.

Bridge setup on Linux in /etc/network/interfaces:

auto service
# Host-only bridge
iface service inet manual
    up ip link add service-master address 02:00:00:00:00:01 type dummy
    up ip link set dev service-master up
    up ip link add service type bridge
    up ip link set dev service-master master service
    up ip link set dev service up
    down ip link del service
    down ip link del service-master

Routing of a subnet

If your host system acts as a router for a network, enable IPv4 forwarding (echo "1" > /proc/sys/net/ipv4/ip_forward), and assign the router's IP address to the bridge (up ip addr add 192.168.0.1/24 dev service).

Physical network interface with IP address space

To put your unikernels on the same network as your host system, add that external network interface to the bridge: up ip link set dev enp0s20f0 master service.

NAT (no public IP address, e.g. for testing on your laptop)

Set up a private network on the service bridge (up ip addr add 192.168.0.1/24 dev service), enable IPv4 forwarding (echo "1" > /proc/sys/net/ipv4/ip_forward), and add a firewall rule (iptables -t nat -A POSTROUTING -o enp0s20f0 -j MASQUERADE).
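For quick testing outside of /etc/network/interfaces, the three NAT steps can be collected in one place. The sketch below only prints the commands and executes nothing, so they can be reviewed before being run as root. The function name nat_setup_commands is made up for illustration, and the bridge, address and uplink interface are the example values from the text.

```shell
#!/bin/sh
# Hypothetical helper: print the NAT setup commands for a bridge,
# its host-side address, and the uplink network interface.
nat_setup_commands() {
    bridge=$1   # e.g. service
    addr=$2     # e.g. 192.168.0.1/24
    uplink=$3   # e.g. enp0s20f0
    # Private network on the bridge
    printf 'ip addr add %s dev %s\n' "$addr" "$bridge"
    # IPv4 forwarding
    printf 'echo 1 > /proc/sys/net/ipv4/ip_forward\n'
    # Masquerade traffic leaving via the uplink interface
    printf 'iptables -t nat -A POSTROUTING -o %s -j MASQUERADE\n' "$uplink"
}

# Prints the three commands for the example values used in the text.
nat_setup_commands service 192.168.0.1/24 enp0s20f0
```

These settings do not survive a reboot when applied by hand; the up lines in /etc/network/interfaces remain the persistent variant.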

Unikernel execution

Download the traceroute unikernel (direct link to unikernel image), and run it via albatross. In one shell, observe the console output with albatross-client-local console traceroute. In a second shell, create the unikernel with albatross-client-local create --net=service traceroute traceroute.hvt --arg='--ipv4=192.168.0.2/24' --arg='--ipv4-gateway=192.168.0.1'
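When starting several unikernels on the same bridge, each one needs its own name and IP address. A minimal sketch that assembles the create command from above for given values follows. The function name create_command is made up for illustration, and it only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print the albatross create command for one unikernel
# on the "service" bridge, using only the options shown in the text.
create_command() {
    name=$1      # unikernel name, e.g. traceroute
    image=$2     # unikernel image, e.g. traceroute.hvt
    ip=$3        # address with prefix length, e.g. 192.168.0.2/24
    gateway=$4   # gateway address, e.g. 192.168.0.1
    printf "albatross-client-local create --net=service %s %s --arg='--ipv4=%s' --arg='--ipv4-gateway=%s'\n" \
        "$name" "$image" "$ip" "$gateway"
}

# The command from the text.
create_command traceroute traceroute.hvt 192.168.0.2/24 192.168.0.1
```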

That’s it. Albatross has more features, such as block devices, multiple bridges (for management, private networks, …), restart on certain exit codes, and assignment to a specific CPU. It also has remote command execution and resource limits (you can allow your friends to execute U unikernels with M MB of memory and B MB of block devices, accessing your bridges X and Y). There is a daemon to collect metrics and report them to Telegraf (to push them into Influx and view them in nice Grafana dashboards). MirageOS unikernels also support IPv6, so you’re not limited to legacy IP.

You can also use albatross-client-local update to ensure you’re running the latest unikernel: it checks https://builds.robur.coop for the job and suggests an update if a newer binary is available.

For someone who wants to build and run MirageOS unikernels

The fundamental tools for building in a reproducible way are orb and builder. On some distributions we provide binary packages (orb, builder) that you can use. On other distributions you’ll need to bootstrap them from source:

  • To build in a reproducible way, we developed orb, which is written in OCaml. It is an opam package available on GitHub at roburio/orb. Once you have OCaml and opam installed, install it via opam pin add orb https://github.com/roburio/orb.git.

  • To build builder, opam install builder is all you need to do. opam install builder-web will install the latest version of builder-web.

Setup builder

On Linux:

Builder provides a systemd service (builder) that you should start. There is also a builder-worker service that executes the worker process in a docker container. Check the URLs and configuration in the systemd service files, modify them if necessary using systemctl edit --full builder-worker.service, and start the service. As of writing, the provided builder-worker.service script builds for Ubuntu 20.04.

On FreeBSD:

For FreeBSD, rc scripts and an example jail.conf (and shell script to launch) are provided. Setting up a jail is documented in the README (using poudriere).

Setup builder-web

Builder-web needs an initial database and an initial user, and also comes with a service script. Use the builder-db migrate command to create an initial database, and builder-db user-add --unrestricted my_user to create a privileged user my_user. Set up your builder to use reproducible packages from builder-web and upload results there (by passing --upload https://my_user:my_password@builds.robur.coop/upload).

Schedule an orb job

The command builder-client info should output the schedule, queues, and running builds. To schedule a daily build, run builder-client orb-build traceroute traceroute-hvt. This will create a new job named traceroute and pick up the job template (/etc/builder/orb-build.template.PLATFORM) and schedule that job to your worker in order to build the opam package traceroute-hvt.

All commands are documented: run any of them with --help to see its man page.

Reproducing builds

From a build on https://builds.robur.coop, select an operating system and distribution that has been used for a build. Go to the specific build, and download the “system-packages” file, which lists the exact versions of host system packages that were used during the build. Make sure they’re installed, since version variance may lead to non-reproducibility. Orb and builder are not needed for a manual rebuild.

Download the build-environment file, which contains all environment variables that were set during the build. Set these, and only these, in your shell.
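One way to set these variables, and only these, is to start the command from an empty environment. The sketch below assumes that the build-environment file contains one KEY=VALUE pair per line, so check the file you downloaded before relying on it. The function name run_in_build_env is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run a command with exactly the variables listed in
# an environment file (assumed format: one KEY=VALUE per line).
# The body runs in a subshell, so the caller's shell is not modified.
run_in_build_env() (
    envfile=$1
    shift
    # Prepend every KEY=VALUE line to the argument list, in front of the command.
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            *=*) set -- "$line" "$@" ;;
        esac
    done < "$envfile"
    # env -i starts from an empty environment and adds only the given pairs.
    exec env -i "$@"
)

# Usage (not executed here):
#   run_in_build_env build-environment opam switch import opam-switch --switch reproduced-unikernel
```

Since the environment starts out empty, the command is only found if the file sets a suitable PATH.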

Install opam (at least version 2.1). Then, download the opam-switch file, which includes all opam files and dependencies (including the OCaml compiler). Execute opam switch import opam-switch --switch reproduced-unikernel, which will create a fresh opam switch and install the unikernel into it. The resulting binary will be located under the prefix of that switch, at bin/unikernel.hvt.
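To check that the rebuild really produced the same binary, compare its checksum with that of the binary downloaded from the build page. A minimal sketch follows. The function names are made up for illustration, and SHA-256 is used here only as an example digest, relying on sha256sum (GNU coreutils) or sha256 (FreeBSD).

```shell
#!/bin/sh
# Hypothetical helper: print the SHA-256 digest of a file as hex.
sha256_of() {
    if command -v sha256sum >/dev/null 2>&1; then
        # GNU coreutils prints "<digest>  -" when reading stdin; keep the digest only.
        sha256sum < "$1" | cut -d ' ' -f 1
    else
        # FreeBSD: -q prints only the digest.
        sha256 -q "$1"
    fi
}

# Succeed (exit code 0) only if both files have the same digest.
same_binary() {
    [ "$(sha256_of "$1")" = "$(sha256_of "$2")" ]
}

# Usage (not executed here, paths are placeholders):
#   same_binary path/to/rebuilt/unikernel.hvt path/to/downloaded/unikernel.hvt && echo reproduced
```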

Core software components in more detail

orb

The Opam Reproducible Builder uses the opam libraries to conduct a build of an opam package using any opam repositories. It collects system packages, environment variables, and a full and frozen opam switch export. These artifacts contain the build information and can be used to reproduce the exact same binary.

builder

Builder is a suite of three executables: builder-server, builder-worker and builder-client. Together they periodically run scheduled jobs which execute orb, collecting build artifacts and information used for reproducing the build. The builder-worker is executed in a container or jailed environment, and communicates via TCP with the builder-server. The result of the build can be uploaded to builder-web or stored in the file system.

builder-web

Builder-web is a web interface for viewing and downloading builds and build artifacts created by builder jobs. The binary checksums can be viewed and the build inputs (opam packages, environment variables, system packages) can be compared across builds.

It uses dream with sqlite3 as backend database. The database schema evolved over time, so we developed migration and rollback tooling to update our live database.

albatross

Albatross is an orchestration system for MirageOS unikernels. It manages system resources (tap interfaces, virtual block devices) that can be passed to the unikernels. It reads the console output of a unikernel and provides it via a TCP stream. It also offers remote access via TLS, which allows not only inspecting the running status but also uploading new unikernels. Albatross integrates with builder-web to look up running unikernels by their hash and optionally update the unikernel binary.

solo5

Solo5 is the tender: the application that runs in the host system as a user process, consuming the system resources, and delegating them to the unikernel. It is a pretty small binary with a tiny API between host and unikernel. For an overview, see the solo5 talk from FOSDEM 2019.

Future

We have enhancements and more features planned for the future. At the same time we are looking for feedback on the reproducible build and unikernel deployment system (from a security perspective, from a devops perspective, etc.). We are also keen to collaborate and would be happy to take new people on board.

  • Improving the web UI on https://builds.robur.coop/. If you’re interested, please get in touch, we have funding available.
  • Supporting more distributions: tell us your favourite distribution and how to build a package, then we can integrate that into our reproducible builds infrastructure.
  • Supporting spt - the sandboxed process tender - to run unikernels without a hypervisor.
  • Data analytics: which system package updates or opam package releases result in variance of the binaries? Did the release of an opam package increase or decrease the overall build times?
  • Functional and performance tests of the unikernels: for each different build, conduct basic functional and performance tests, and graph the results in the output. This also includes data analytics: did the release of an opam package increase or decrease the performance of unikernels?
  • Whole system performance analysis with memory profiling, and how to integrate this into a running unikernel.
  • MirageOS 4.0 support.
  • Metrics and logging collection and dynamic adjustment of metrics and log levels.
  • DNS resolver unikernel, which is still missing DNSSEC support.

Interested? Get in touch with us via email to team at robur dot coop.


We (@rand @reynir @hannes) have several updates in this project:

  • binary package repositories (debian, ubuntu, FreeBSD) are now available with monotonic version numbering
  • the https://builds.robur.coop website has an updated look and feel, and includes dependency visualizations and “which module uses how much space” visualizations (a treemap, based on @Drup modulectomy)

Instructions on how to get started with setting up unikernels are available at Robur Reproducible Builds.


Ooooh, someone is using modulectomy! :blush:

It seems you have put quite a bit of work into it, if you want to help finish and publish the package, I would be delighted.

The other visualization of a tree of deps is pretty nice too. :slight_smile:


I failed to find examples of treemap-style visualizations by randomly clicking around in builds.robur.coop, do you have a link?

The visualizations are found under the specific builds, e.g.: Job tlstunnel 2022-02-19 16:43:34Z

There are some upcoming blogposts on the developments (:

The background article by @rand is now online: r7p5 | Builder-web visualizations at Robur

Some of the changes lead to a more coarse-grained overview (not on a function basis, but on a module / opam package basis) → fewer rectangles, but I suspect in certain scenarios you’d like to have fine-grained analysis. So maybe we should work on re-integrating our changes and provide a switch to choose whether coarse (& fast) or fine (& slow) is desired.


Having the library offer both would be pretty nice, I think.
You also made nice graphic changes and other code improvements which seem strictly beneficial.

In truth, in the original version, I made the fine-grained view because I could, not because I thought it was absolutely necessary for visualization. If your version is faster and sufficient in practice, I’m rather happy with that. :wink: