I am delighted to present the first experimental release of Esperanto. This project is a new OCaml toolchain that creates binaries compiled with the Cosmopolitan C library and linked with the αcτµαlly pδrταblε εxεcµταblε link script. The binary produced is then portable to different platforms:
The main objective of Esperanto is to provide a toolchain capable of producing a portable binary from an existing project. This finally makes it possible to distribute software for all these platforms without having to:
- manage each platform separately: the Cosmopolitan C library offers the POSIX API on all platforms (including Windows)
- produce a separate version of the same software for each platform: a single binary runs on all of them
Cosmopolitan does not, however, produce a binary containing assembler for several architectures. At this stage, our distribution only supports x86_64 assembler (the most common one), but we are working on producing binaries for other architectures.
I would like to give special thanks to Justine, the author of the Cosmopolitan project (which she uses to develop redbean, a small portable HTTP server), for her excellent work.
A toolchain
In OCaml, the “toolchain” principle allows several compilers to coexist within an OPAM switch and lets you choose one of them when cross-compiling a project. Although this principle is not clearly defined and its use remains very limited, it exists through the ocamlfind tool.
You can find these toolchains in your switch:
$ ls $(opam var lib)/findlib.conf.d/
esperanto.conf solo5.conf
From our experience with Mirage, as well as the work done in dune regarding cross-compilation, proposing a new toolchain to allow cross-compilation of projects with OPAM is both the historical choice and, in our opinion, the most relevant one [1].
Why do we need to cross-compile?
The term cross-compilation can be misunderstood if we only consider the question of the assembler issued by the compiler (does it match the host assembler or not). In our case, cross-compilation is a broader term that implies the use of external artefacts to the compiler that are different from the default and the use of compiler options that must be used throughout the production of the final binary.
In other words, even though we are emitting the same assembler, we are doing so in a different “context” which requires the definition of a new toolchain which includes our artefacts and compiler options.
One of these artefacts is, of course, the C library used by the compiler, which is then systematically used by the OCaml runtime, the well-named libasmrun.a. This is why, for example, there is a version of OCaml with musl. So there must be a version of OCaml with Cosmopolitan.
This new toolchain also lets you include the options needed to compile C files because, yes, you can compile a C file with ocamlopt.
In order to provide a coherent workflow for a project, we need to provide not only a libasmrun.a compiled with our Cosmopolitan C library but also an OCaml compiler capable of invoking the C compiler with the options required by Cosmopolitan.
Finally, we also need to describe in this toolchain how to link the object files together to actually produce a portable binary using the APE script.
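To see which options the toolchain hands to the C compiler, one possible experiment is to compile a small C file through ocamlopt in verbose mode. This is only a sketch: stub.c and its hello function are made-up names, and the check for esperanto.conf simply mirrors the findlib.conf.d listing shown earlier, so that the compiler is only invoked when the toolchain appears to be installed.

```shell
# Work in a scratch directory; stub.c is a hypothetical file name.
workdir=$(mktemp -d)
cat >"$workdir/stub.c" <<'EOF'
#include <stdio.h>

void hello(void) { puts("Hello from C"); }
EOF

# Only try the cross-compiler when the esperanto toolchain seems installed.
if command -v opam >/dev/null 2>&1 && command -v ocamlfind >/dev/null 2>&1 &&
   [ -f "$(opam var lib 2>/dev/null)/findlib.conf.d/esperanto.conf" ]; then
  # -verbose prints the external commands that are run, which should
  # include the C compiler invocation and the flags passed to it.
  (cd "$workdir" && ocamlfind -toolchain esperanto opt -verbose -c stub.c)
fi
```

Comparing this output with the one obtained without -toolchain esperanto should show what the new context changes.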
A simple example with this new toolchain
Installing Esperanto is very easy with OPAM. It installs the cross-compiler and the files needed for ocamlfind/dune to recognise this new toolchain:
$ opam install esperanto
Now, let’s try to produce a simple binary that displays “Hello World!”:
$ cat >main.ml <<EOF
let () = print_endline "Hello World!"
EOF
$ ocamlfind -toolchain esperanto opt main.ml
$ objcopy -S -O binary a.out
$ file a.out
a.out: DOS/MBR boot sector
The binary produced can already be executed. However, a few issues remain: they have been fixed upstream, but the fixes are probably not yet available on your system. They concern zsh and binfmt_misc in particular.
The first problem is that zsh does not recognise the binary correctly. This problem has been fixed in the latest version of zsh, 5.9.0.
$ zsh --version
zsh 5.8.1
$ zsh
$ ./a.out
zsh: exec format error: ./a.out
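If you are unsure whether your zsh predates the fix, a small check like the one below can decide how to launch the binary. This is only a sketch: the helper name zsh_runs_ape is hypothetical, and it assumes that the fix is present from the 5.9 release onwards, as stated above. Going through sh -c is the same workaround used later in this article.

```shell
# Hypothetical helper: succeeds when the given zsh version is 5.9 or newer.
zsh_runs_ape() {
  major=${1%%.*}              # "5.8.1" -> "5"
  rest=${1#*.}                # "5.8.1" -> "8.1"
  minor=${rest%%.*}           # "8.1"   -> "8"
  minor=${minor%%[!0-9]*}     # drop any suffix such as "-dev"
  [ "$major" -gt 5 ] || { [ "$major" -eq 5 ] && [ "${minor:-0}" -ge 9 ]; }
}

# "zsh --version" prints something like "zsh 5.8.1 (x86_64-pc-linux-gnu)".
if command -v zsh >/dev/null 2>&1; then
  version=$(zsh --version | awk '{ print $2 }')
  if zsh_runs_ape "$version"; then
    echo "zsh $version should execute ./a.out directly"
  else
    echo "zsh $version is too old: run sh -c ./a.out instead"
  fi
fi
```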
The second problem concerns binfmt_misc, which steps in before your programs are executed in order to choose how to run them. In this case, binfmt_misc recognises Cosmopolitan binaries as Windows software by default.
Here too, a solution is available and described by the author of Cosmopolitan here: APE loader
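To check whether your machine is affected, you can look at the handlers registered with binfmt_misc. The sketch below assumes the usual Linux layout, where each handler is a file under /proc/sys/fs/binfmt_misc containing "interpreter" and "magic" lines; the helper name mz_handlers is hypothetical. It lists the handlers whose magic starts with "MZ" (hex 4d5a), the two bytes with which both Windows executables and our a.out begin.

```shell
# Hypothetical helper: print "<entry>: <interpreter>" for every binfmt_misc
# entry whose magic begins with the bytes "MZ" (hex 4d5a).
mz_handlers() {
  dir=${1:-/proc/sys/fs/binfmt_misc}
  for entry in "$dir"/*; do
    [ -f "$entry" ] || continue
    # Unreadable files such as "register" are silently skipped.
    if grep -q '^magic 4d5a' "$entry" 2>/dev/null; then
      interpreter=$(sed -n 's/^interpreter //p' "$entry")
      echo "${entry##*/}: $interpreter"
    fi
  done
}

mz_handlers
```

An entry whose interpreter is a Windows emulation layer would explain the behaviour described above; no output at all means that no such handler is in the way.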
Execution & Assimilation
If you are not concerned by the above problems, you can simply run the program:
$ ./a.out
Hello World!
There is a final solution that requires a little explanation of what αcτµαlly pδrταblε εxεcµταblε is. Indeed, the latter makes it possible to create a polyglot binary whose first point of entry is not your program but a small program which tries to recognize on which platform the binary tries to run.
After this recognition, this little program will “inject” values corresponding to the platform in which you try to run your program in order to finally let Cosmopolitan manage the translation between its interface and the real POSIX interface that your system offers.
Of course, this step has a cost as it adds an indirection between what your program wants to do and what is available on the system running your program. However, APE offers a very special option that allows the program to be assimilated to the platform in which it wants to run.
$ file a.out
a.out: DOS/MBR boot sector
$ sh -c "./a.out --assimilate"
$ file a.out
a.out: ELF 64-bit LSB executable, x86-64
$ ./a.out
Hello World!
This option makes your application truly native to the platform in which you run it. This means above all that the program is no longer portable.
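Since assimilation rewrites the file in place, it can be worth keeping the portable original and assimilating a copy. The sketch below is one way to do that on Linux; the helper name binary_kind and the file name a.native are hypothetical, and the helper only looks at the first bytes of the file ("MZqFpD" for an APE binary, 0x7f "ELF" for an ELF executable).

```shell
# Hypothetical helper: classify a file by its first bytes.
#   "MZqFpD"   (hex 4d5a71467044) -> still a portable APE binary
#   0x7f "ELF" (hex 7f454c46)     -> a native ELF executable
binary_kind() {
  magic=$(od -An -tx1 -N6 "$1" | tr -d ' \n')
  case $magic in
    4d5a71467044) echo portable ;;
    7f454c46*)    echo elf ;;
    *)            echo unknown ;;
  esac
}

# Assimilate a copy so that the portable original survives.
if [ -x ./a.out ] && [ "$(binary_kind ./a.out)" = portable ]; then
  cp ./a.out ./a.native
  sh -c "./a.native --assimilate"
  binary_kind ./a.native
fi
```

After this, a.out can still be copied to another platform, while a.native only runs on the current one.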
Esperanto, dune & opam monorepo
The dune software also incorporates this toolchain idea through the -x option. More pragmatically, it is possible to define a new dune context that uses Esperanto as its compilation toolchain.
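As a minimal sketch of the -x route, the commands below set up a throwaway project and ask dune to build it with the esperanto toolchain. The project layout and file names are invented for the example, and the build itself is skipped when dune or the toolchain is not installed. If dune follows its usual naming for -x contexts, the result should land under _build/default.esperanto.

```shell
# A throwaway project; every file name here is a hypothetical example.
proj=$(mktemp -d)
cat >"$proj/dune-project" <<'EOF'
(lang dune 2.0)
EOF
cat >"$proj/dune" <<'EOF'
(executable
 (name main))
EOF
cat >"$proj/main.ml" <<'EOF'
let () = print_endline "Hello World!"
EOF

# Only build when dune and the esperanto toolchain seem to be available.
if command -v dune >/dev/null 2>&1 && command -v opam >/dev/null 2>&1 &&
   [ -f "$(opam var lib 2>/dev/null)/findlib.conf.d/esperanto.conf" ]; then
  # -x adds a cross-compilation context for the named findlib toolchain.
  (cd "$proj" && dune build -x esperanto)
fi
```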
However, the original aim of Esperanto is to produce a portable binary. This implies, among other things, that the binary should not depend on external artefacts in order to run and, in this sense, the compilation of your project should be a static compilation. This means that all dependencies of your project must be available to compile in the same context as your project.
Again, this is particularly necessary if any of your dependencies include C files, since those files need to be compiled in that context too.
This is where opam monorepo comes in: it simply “vendors” your dependencies into a “duniverse” folder. Here are the steps needed to compile a project with Esperanto. We’ll take decompress, which produces a binary that can compress/decompress documents, as an example:
$ git clone https://github.com/mirage/decompress
$ cd decompress
$ cat >>bin/dune <<EOF
(rule
(target decompress.com)
(enabled_if
(= %{context_name} esperanto))
(mode promote)
(deps decompress.exe)
(action (run objcopy -S -O binary %{deps} %{target})))
EOF
$ cat >dune-workspace <<EOF
(lang dune 2.0)
(context (default))
(context
(default
(name esperanto)
(toolchain esperanto)
(merlin)
(host default)))
EOF
$ opam monorepo lock --build-only
$ opam monorepo pull
$ dune build bin/decompress.com
$ sh -c "echo 'Hello World' | ./bin/decompress.com -d | ./bin/decompress.com"
Hello World
Issues
The Esperanto toolchain is not yet complete, however. The OCaml distribution ships several libraries, such as unix.cmxa and threads.cmxa. A little work has been done to make the former available. The latter is unavailable for the moment, since Cosmopolitan only partially implements pthread.
However, it seems that the author of Cosmopolitan wants to implement the rest of the pthread API, which will then allow us to provide support for threads.cmxa and OCaml 5.
This of course makes support for existing projects more limited than we imagined (which is why this release is experimental). However, an effort has already been made to support lwt via Cosmopolitan’s hypothetical future support for pthread.
Future
As explained above, support for threads.cmxa and OCaml 5 remains the priority.
It is also possible that Cosmopolitan could become a target for the MirageOS project, in the same way as Solo5 (or our recent experiment on the Raspberry Pi 4).
In this sense, we will surely propose an integration in MirageOS so that projects can produce either unikernels with Solo5 or portable binaries with Cosmopolitan.
1: However, the question remains open at several levels: that of the compiler, that of OPAM and, of course, that of dune. It is clear that the current situation is not ideal in terms of what we need to do to produce such a cross-compiler. Only the feedback from Solo5 (which requires cross-compilation) allows us to say that it is surely the right choice for what we want to offer.
Conclusion
We hope that this project will facilitate the distribution of software. You can read a more technical article about our work here. Finally, I would like to thank robur.io (an association you can help) for allowing me to do this project.
EDIT: The author of Cosmopolitan just released Cosmopolitan with pthread support. So we will definitely try to improve our distribution to include OCaml with threads.cmxa support and move forward with OCaml 5!