MirageOS orchestration

#1

The Problem

While experimenting with development and deployment of multiple MirageOS unikernels across multiple hosts, I quickly discovered that manual orchestration is tedious and doesn’t scale.

ukvm

The recommended unikernel monitor for kvm.

  • Unique to each unikernel with no awareness of other unikernels.
  • Doesn’t have a configuration language - just CLI args.
  • Doesn’t create required resources (e.g. network devices).

libvirt

Virtualisation platform (e.g. KVM, Xen) management layer.

  • Tediously complex XML config language with few short-cuts.
  • Not new-user friendly - requires a lot of reading just to get a single VM (i.e. unikernel) going.
  • Does create configured resources if non-existent (e.g. network, pty, block devices).
  • Provides many great hypervisor and management features (e.g. networks, network filters, storage).
  • Only supports the virtio target.

Neither of these tools makes it easy to build, configure and manage multiple unikernels across multiple hosts. Scripting XML or the CLI is obviously the wrong approach.

Existing Solutions

I searched around, hoping to find an existing MirageOS orchestration system or something similar, but didn’t get very far:

ocaml-libvirt

Out-of-date OCaml bindings to libvirt’s C library.

  • Major version(s) behind libvirt and doesn’t appear to be in use in any active projects.
  • If updated, it would provide the blocks to build a libvirt orchestrator.
  • Obviously anything written with these bindings won’t be able to be a MirageOS unikernel itself.

Unik

Going by the docs, Unik provides [for MirageOS] a basic build wrapper and YAML-based ukvm configuration.

  • Has hard dependencies on Docker and VirtualBox, which seems a bit restrictive.
  • Generic instance management functions are limiting.
  • Appears to be largely inactive.

An example unikernel platform

I guess what I’d like to see (or develop) is a MirageOS/unikernel platform that provides smart CI/CD and Kubernetes-like orchestration.

  1. User provides platform a git uri or tarball of unikernel dist
  2. Platform parses config.ml and determines unikernel requirements (e.g. devices, resources, access)
  3. Platform knows the capabilities of available hosts
  4. Platform allows the user to configure within the requirements and available capabilities (e.g. affinity, memory, count)
  5. Platform builds unikernel for the hypervisor/target supported by a suitable destination host
  6. Platform configures any environmental changes required by the unikernel or specified by the user (e.g. devices, network filters, DHCP leases)
  7. Platform deploys the unikernel to the destination host(s)
  8. Unikernel is now running and managed by platform (e.g. auto-restart, failover to other hosts, auto-discovery, etc)
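Steps 2 to 5 boil down to matching what a unikernel needs against what each host offers. Here is a minimal sketch of that matching in Python. All class and field names are hypothetical, and the requirements are assumed to have already been extracted from config.ml by some other means; this is an illustration of the idea, not an existing tool.

```python
from dataclasses import dataclass, field


@dataclass
class Requirements:
    """What a unikernel needs (hypothetical fields, as if derived from config.ml)."""
    name: str
    memory_mb: int
    bridges: set = field(default_factory=set)


@dataclass
class Host:
    """What a host offers (hypothetical fields)."""
    name: str
    targets: set          # build targets the host can run, e.g. {"ukvm", "virtio", "xen"}
    free_memory_mb: int
    bridges: set


def suitable_hosts(req, hosts):
    """Return hosts that can satisfy the requirements, most free memory first."""
    ok = [h for h in hosts
          if h.free_memory_mb >= req.memory_mb and req.bridges <= h.bridges]
    return sorted(ok, key=lambda h: h.free_memory_mb, reverse=True)


def plan(req, hosts, preferred_targets=("ukvm", "virtio", "xen")):
    """Pick a host and a build target it supports, or None if nothing fits."""
    for host in suitable_hosts(req, hosts):
        for target in preferred_targets:
            if target in host.targets:
                return host.name, target
    return None
```

With a result like `("host-b", "ukvm")` the platform would know both where to deploy and which target to build for; affinity rules and instance counts from step 4 would be extra filters in `suitable_hosts`.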

I appreciate any suggestions, corrections or information on the topic. Thanks for reading.


#2

I’m aware of the existence of albatross by @hannes, which (from afar) looks like what you’re looking for?


#3

dear @bramford, I suffered from this as well (running atm 11 different virtual machines and shell-scripting deployment didn’t scale anymore).

From your example unikernel platform, albatross (please read https://hannes.nqsb.io/Posts/VMM first) is WIP and does not attempt to compile anything. Albatross is a family of processes running on the hypervisor:

  • vmmd (executing under root privileges at the moment, able to create and destroy virtual machines)
  • vmm_console keeps a ring buffer of the console output (which is redirected to a fifo) of each unikernel
  • vmm_log is a global event log
  • vmm_stats monitors getrusage, ifdata, and bhyvectl --get-stats
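To illustrate the ring buffer idea behind vmm_console, here is a minimal Python sketch. It is not albatross code (albatross is written in OCaml); the class name, the line-based granularity and the default capacity are assumptions made for the example.

```python
from collections import deque


class ConsoleBuffer:
    """Keeps only the most recent `capacity` console lines of one unikernel."""

    def __init__(self, capacity=1024):
        # A deque with maxlen drops the oldest entry automatically when full.
        self._lines = deque(maxlen=capacity)

    def append(self, line):
        """Store one line of console output, without its trailing newline."""
        self._lines.append(line.rstrip("\n"))

    def tail(self, n=None):
        """Return the last n lines (all retained lines if n is None)."""
        lines = list(self._lines)
        if n is None:
            return lines
        return lines[max(len(lines) - n, 0):]
```

The point of the bounded buffer is that a chatty unikernel cannot exhaust memory on the hypervisor, while a client can still fetch recent output after the fact.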

Authentication is done via X.509 certificates and the policies encoded in them, so a client is not able to overprovision their running unikernels (atm the pages allocated by ukvm-bin are not accounted for). At the moment, vmmd also accepts TLS communication (but this will soon be a separate process). vmmd also creates tap devices (and attaches them to the specific bridges) dynamically. There’s no “restart on crash” feature implemented, nor a “persist this unikernel” one (i.e. start when albatross boots). The code is tested on FreeBSD only.

In the current security model, I try to prevent anyone who has a valid CA certificate from accessing any L2 (bridge) they don’t have access to, and from starving host system resources (such as file descriptors, tap devices, …, i.e. kernel memory). Each unikernel is pinned to a single CPU (the set of available CPUs is written in the CA certificate, as are memory, bridges, and number of VMs). There’s still plenty of room for improvement, including L3 routing in the hypervisor https://vincent.bernat.im/en/blog/2018-l3-routing-hypervisor, and real block device support.
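The kind of admission check such a policy implies can be sketched as follows. This is a Python illustration only: the field names are hypothetical, and it says nothing about how albatross actually encodes policies in certificates or enforces them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Limits delegated to a client (field names are hypothetical)."""
    cpus: frozenset       # CPU ids the client may pin unikernels to
    memory_mb: int        # total memory across all of the client's unikernels
    bridges: frozenset    # bridges the client may attach to
    max_vms: int          # maximum number of running unikernels


@dataclass(frozen=True)
class VM:
    cpu: int
    memory_mb: int
    bridges: frozenset


def admits(policy, running, request):
    """True if starting `request` keeps the client within its policy."""
    used_memory = sum(vm.memory_mb for vm in running)
    return (
        len(running) < policy.max_vms
        and request.cpu in policy.cpus
        and request.bridges <= policy.bridges
        and used_memory + request.memory_mb <= policy.memory_mb
    )
```

Because every limit is checked against the sum of what is already running, a client cannot overprovision by starting many small unikernels.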

I’m convinced that albatross is very much WIP, and that we should have a good integration story of MirageOS unikernels with Kubernetes etc. as well. Once the TLS bits are moved into a separate process, I plan to investigate how to run albatross via startup scripts (and start a set of persistent unikernels upon boot).


#4

Ah sweet, yeah I did actually see albatross but wrongly dismissed it as an experimental hack project (I didn’t see/read the associated blog post). Having read the blog, I agree, it seems like [the beginnings of] what I’m looking for - I’ll get it running and see how it goes.

On the topic of Libvirt - I’m curious to know why this isn’t the obvious preference for unikernels given its prevalence and maturity as a VM manager? Is it due to ukvm's superior performance [over virtio] or are there other reasons?


#5

When using the xen target (-t xen), MirageOS emits a libvirt.xml file which should be usable by tools supporting libvirt (if there’s interest and maybe a PR ;), we could generate a similar file for ukvm and virtio targets). When I started developing albatross, libvirt looked too complex for my use case, and my goal with albatross is to have a minimalistic deployment system, customised for MirageOS unikernels.
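For a rough idea of what generating such a file for the virtio target could involve, here is a Python sketch that builds a small libvirt domain definition. It is not the file MirageOS emits: the function name, parameters and example paths are hypothetical, the elements are ordinary libvirt domain XML for direct kernel boot with a bridged virtio NIC, and whether a given unikernel image boots this way is untested here.

```python
import xml.etree.ElementTree as ET


def domain_xml(name, kernel_path, memory_kib, bridge):
    """Build a minimal libvirt domain definition as an XML string."""
    domain = ET.Element("domain", type="kvm")
    ET.SubElement(domain, "name").text = name
    ET.SubElement(domain, "memory", unit="KiB").text = str(memory_kib)
    ET.SubElement(domain, "vcpu").text = "1"

    # Direct kernel boot: libvirt loads the image given in <kernel>.
    os_el = ET.SubElement(domain, "os")
    ET.SubElement(os_el, "type", arch="x86_64").text = "hvm"
    ET.SubElement(os_el, "kernel").text = kernel_path

    # One virtio network interface on an existing bridge, plus a pty console.
    devices = ET.SubElement(domain, "devices")
    iface = ET.SubElement(devices, "interface", type="bridge")
    ET.SubElement(iface, "source", bridge=bridge)
    ET.SubElement(iface, "model", type="virtio")
    ET.SubElement(devices, "console", type="pty")

    return ET.tostring(domain, encoding="unicode")
```

Calling `domain_xml("dns", "/srv/unikernels/dns.virtio", 32768, "br0")` returns a string that could be written to a file and handed to libvirt tooling; the name, path and bridge here are made up.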

The ukvm-bin from the solo5 project itself is minimal (some thousands of lines of code) and does not need any dependencies (unlike kvm, which uses qemu). The direct ukvm target (i.e. not virtio, which is supported by solo5 as well) is tiny.

The main reason for me is reducing the trusted computing base, every line counts :wink:


#6

The main reason for me is reducing the trusted computing base, every line counts :wink:

Makes sense on the minimal trusted computing base.

I still see a strong case for libvirt due to its pervasiveness. Few organisations will be able to run entirely on unikernels [in the foreseeable future] and thus will need to manage/mix traditional VMs with unikernels. It is likely that such organisations already have orchestration layers managing libvirt (e.g. Terraform, OpenStack) that must also be used to manage unikernels.

if there’s interest and maybe a PR ;), we could generate a similar file for ukvm and virtio targets

I’ll have a look at this - I have some experience with libvirt and am also looking for ways to contribute to Mirage.