On interoperability between Rust and OCaml

ninjaaron · August 20, 2020, 10:04am

Has anyone tried this and blogged about it?

I haven’t delved into OCaml’s C interface enough to have much to say about this topic, and I’m just getting my feet wet in Rust. Rust is definitely the “spiritual offspring” of ML and company (especially OCaml), so there might be some synergy there.

However, my experience in the Python world is that even if you write your native extensions in another language, it’s all going through the C interface anyway, which limits your ability to leverage higher-level abstractions like those in Rust. The other problem is (of course) distribution. If you don’t want to distribute your work, or you don’t mind it only being accessible to people with the Rust compiler, that’s fine, but if your application targets a wider audience, C seems like a more sensible extension language.

Still, might be an interesting personal exercise to try and create some libraries to simplify OCaml and Rust interop—though I have to admit, OCaml is actually the main thing that’s keeping me from learning Rust properly! Every time I think I have a project for Rust, I end up being like, “nah. OCaml is fast enough and I like garbage collection and lists and TCO.”

c-cube · August 20, 2020, 2:18pm

There are some projects already:

dbuenzli · August 20, 2020, 3:57pm

As much as I can see what a good idea it would be to use Rust for what we usually use C for, I still don’t understand how that is going to happen if they are not willing to commit to a stable ABI and make their ecosystem usable without having to commit to their monolithic cargo tooling.

c-cube · August 20, 2020, 4:14pm

not willing to commit to a stable ABI

Well you can expose functions with the C ABI from rust, like the rest of
the world. I’m not sure what a richer ABI would be useful for?
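For illustration, a minimal sketch of what exposing a function with the C ABI looks like on the Rust side. The names `interop_add`, `CBinaryOp` and `call_through_c_abi` are hypothetical and not taken from any of the projects mentioned in this thread.

```rust
/// Uses the C calling convention, so C code (and therefore OCaml's C stubs)
/// could call it. A real library would also carry the `no_mangle` attribute
/// to keep the symbol name stable (spelled `#[no_mangle]` through the 2021
/// edition and `#[unsafe(no_mangle)]` in the 2024 edition); it is left out
/// here so the sketch compiles unchanged under any edition.
pub extern "C" fn interop_add(a: i32, b: i32) -> i32 {
    // Wrapping arithmetic: an overflow panic must not unwind across the FFI boundary.
    a.wrapping_add(b)
}

/// What a C caller effectively sees: a plain C function pointer.
pub type CBinaryOp = extern "C" fn(i32, i32) -> i32;

/// Calls through the C-ABI pointer, the way foreign code would.
pub fn call_through_c_abi(op: CBinaryOp, a: i32, b: i32) -> i32 {
    op(a, b)
}
```

Only the signature is visible to the foreign side, which is why everything richer than C types has to be flattened at this boundary.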

and make their eco-system usable without having to commit to their monolithic cargo tooling.

It’s just like dune/opam, it makes libraries usable. You can still use
rustc directly if you want, it’s just as hard as using ocamlopt
directly. I think the cargo team is working on exporting build scripts
from cargo for these kinds of edge use cases though.

As far as I can tell, cargo makes rust a lot more usable than C or C++,
with their per-project collections of weird tooling, configure,
(gnu)make, meson, cmake, etc.

Chet_Murthy · August 20, 2020, 4:20pm

Two thoughts: (1) oy, it sure is a pity that Rust doesn’t seem to expose a stable ABI. I mean, sure, let it change from release-to-release, but without an ABI, you really can’t do callouts, sigh.
(2) There’s lots that’s possible with a richer ABI. One simple example (I’m sure there are more): Relying on the Google C++ Style Guide and C++ STL containers, I once wrote a C++ FFI IDL compiler, that was much more aggressive in dealing with complex arguments, and their memory-ownership. In an automatic way. It was all based on templates and C++ STL, so there would have been no way to do it for C. It’s true that it still went thru the C FFI, but the programmer never saw that, b/c the C code was automatically generated. It was also simple, with all the fancy code being in C++.

No idea if such a thing would be possible for Rust.

Also, it occurs to one, that given Rust’s (wonderful) commitment to various forms of refcounting, maybe it could have a mode where it exposed its functions in a form that looked like C++. That would constitute a stable, well-understood ABI …
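As a rough sketch of how memory ownership can be handed across the C ABI by hand in Rust (the kind of glue an IDL compiler like the one described above would generate), here is the common opaque-handle pattern. `Counter` and the three functions are hypothetical names made up for this example.

```rust
/// `repr(C)` keeps the type FFI-safe; foreign code still treats it as a handle.
#[repr(C)]
pub struct Counter {
    hits: u64,
}

/// Allocates on the Rust heap and hands ownership to the caller as a raw pointer.
pub extern "C" fn counter_new() -> *mut Counter {
    Box::into_raw(Box::new(Counter { hits: 0 }))
}

/// Borrows the counter for the duration of the call.
/// Safety: `c` must come from `counter_new` and must not have been freed.
pub unsafe extern "C" fn counter_incr(c: *mut Counter) -> u64 {
    let counter = unsafe { &mut *c };
    counter.hits += 1;
    counter.hits
}

/// Takes ownership back and drops the allocation.
/// Safety: `c` must come from `counter_new` and must not be used afterwards.
pub unsafe extern "C" fn counter_free(c: *mut Counter) {
    if !c.is_null() {
        drop(unsafe { Box::from_raw(c) });
    }
}
```

An OCaml binding would typically wrap such a pointer in a custom block whose finalizer calls the free function, so the garbage collector ends up driving the release.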

dbuenzli · August 20, 2020, 4:21pm

If you want to be able to access Rust data structures from OCaml you somehow need to have guarantees about how they are going to be laid out in memory.

I have seen similar claims being made elsewhere, but could you point to documentation about this? Last year I looked into that (admittedly not very hard) and I just couldn’t find anything (e.g. how do you do dependency analysis on sources, etc.).

Monolithic systems are always easier to design and use. What is more challenging is to make easy to use truly composable systems…

dbuenzli · August 20, 2020, 4:24pm

Also if Rust is going to be the new C, this is not going to be an edge case…

c-cube · August 20, 2020, 4:41pm

If you want to be able to access Rust data structures from OCaml you somehow need to have guarantees about how they are going to be laid out in memory.

Other reprs - The Rustonomicon

To be fair, I haven’t explored this too much. Zach Shipko and others
have done a lot of things around rust ∩ OCaml and are far more
knowledgeable.
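To make the layout point concrete, here is a small sketch of what `#[repr(C)]` (the representation the Rustonomicon page above covers) guarantees. The struct `Sample` and the helper `sample_layout` are hypothetical, and the numbers assume a common target where `u32` is 4-byte aligned.

```rust
/// With `repr(C)`, fields are laid out in declaration order with C's padding
/// rules. Without it, the compiler is free to reorder them.
#[repr(C)]
pub struct Sample {
    pub tag: u8,
    pub x: u32,
    pub y: u16,
}

/// Returns (size, alignment, [offset of tag, offset of x, offset of y]).
pub fn sample_layout() -> (usize, usize, [usize; 3]) {
    let s = Sample { tag: 0, x: 0, y: 0 };
    let base = &s as *const Sample as usize;
    let offsets = [
        &s.tag as *const u8 as usize - base,
        &s.x as *const u32 as usize - base,
        &s.y as *const u16 as usize - base,
    ];
    (
        std::mem::size_of::<Sample>(),
        std::mem::align_of::<Sample>(),
        offsets,
    )
}
```

Under those assumptions the offsets are 0, 4 and 8 and the size is 12 bytes, exactly what a C compiler would produce for the equivalent struct, so foreign code can rely on it.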

c-cube:

It’s just like dune/opam, it makes libraries usable. You can still use
rustc directly if you want, it’s just as hard as using ocamlopt
directly.

I have seen similar claims being made elsewhere, but could you point to documentation about this? Last year I looked into that (admittedly not very hard) and I just couldn’t find anything (e.g. how do you do dependency analysis on sources, etc.).

I think if you want to build a whole project, it’s the (unstable) cargo build --build-plan, which outputs commands in JSON. Otherwise calling
rustc directly works, of course, if you do like C and perform
dependency analysis yourself.

github.com/rust-lang/cargo

Tracking issue for build plan generation (--build-plan)

opened 08:11PM - 27 May 18 UTC

jonas-schievink

Z-build-plan S-needs-design C-tracking-issue S-needs-team-input

Implemented in https://github.com/rust-lang/cargo/pull/5301, Cargo gained the ability to produce a JSON build plan containing a list of commands that need to be executed to build a target. The feature is currently unstable and requires a nightly Cargo release as well as the `-Zunstable-options` command line argument. This issue tracks its eventual stabilization. This feature was originally requested in https://github.com/rust-lang/cargo/issues/3815 to facilitate Cargo's integration into other build systems.

Monolithic systems are always easier to design and use. What is more challenging is to make easy to use truly composable systems…

Right, but what truly composable systems do we currently have anyway?

Cargo is very composable in the rust world (and sometimes with embedding
C in rust), which is already better than most of C or C++ projects (I
imagine meson/cmake are a step towards that, but certainly not as easy
as cargo).

I’m not saying rust is perfect, but there’s already a bunch of people
who have worked hard on integrating rust with C or C++ (in both ways);
eg. in firefox, librsvg, …

bluddy · August 20, 2020, 4:42pm

Strict aliasing was started by C, AFAIK, and it’s the ‘feature’ from hell. It literally made existing code crash in the most arbitrary manner costing me about a month of debug time. There is a massive number of programs out there that now suffer from random bugs due to the adoption of strict aliasing by default. It’s no wonder Linus turned it off for Linux.

Regardless, I didn’t know about these things you’ve pointed to. The surface area of C++ is now so vast you can easily miss large chunks of it, and it’s all advancing way too fast for casual programmers to keep up.

The placement new thing is obnoxious for primitive types. I understand where they were going with this, but they clearly didn’t think it through enough. It does make sense for more complex objects though, which need initialization. Placement new is clearly a hack that was added to the language to support more efficient use-cases (which C++ feels it has to own as it must be the king of efficiency), and adding that hack causes many other issues, requiring even more hacks.

dbuenzli · August 20, 2020, 4:57pm

So it’s composable in a closed world assumption where you use only Rust. As far as my own definition of composable system is concerned this is not a composable system, that’s a monolithic system where everything has to be in Rust.

FWIW I think the whole trend of each language reinventing its own package manager (and I include opam into this) and integrated language specific build system is a wrong track and a waste of time as far as building software is concerned.

Especially if you are a niche language or want to become the new low level interface it seems to be a good idea to invest into making your tools easy to interoperate so that creating systems that may include some components written in your language are easy to build and integrate in generic build systems (and I’m not even talking about runtime language interoperability here) and package managers.

Chet_Murthy · August 20, 2020, 4:58pm

Nope. FORTRAN (at least): [link: Aliasing (computing) - Wikipedia ]

In Fortran, procedure arguments and other variables may not alias each other (unless they are pointers or have the target attribute), and the compiler assumes they do not.

Back in the day (and we’re talking, pre-ANSI C) this was part of why FORTRAN code was so much faster than C. The original Numerical Recipes was in FORTRAN. Eventually people switched over to C, but it took quite a while. I wondered why that happened, given that NA types are so heavily into speed. Now I guess I understand: C adopted strict aliasing from FORTRAN.
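Since the thread is about Rust: the Fortran rule quoted above is about arguments not aliasing one another, and Rust enforces a comparable guarantee for `&mut` references at compile time rather than leaving it as an unchecked assumption. A minimal sketch, with the hypothetical function `add_twice`:

```rust
/// `dst` and `src` can never overlap: `&mut` is an exclusive borrow,
/// so the compiler may keep `*src` in a register across both additions.
pub fn add_twice(dst: &mut i32, src: &i32) {
    *dst += *src;
    *dst += *src;
}

// A call such as `add_twice(&mut a, &a)` is rejected at compile time
// (error E0502), whereas the equivalent C call would compile and only
// be a problem if the parameters were declared `restrict`.
```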

c-cube · August 20, 2020, 5:14pm

Rust, and some embedded (vendored) C. Things like bindings to zlib, for example, give you a choice at compile time between using the system library, or using an optional vendored copy of the library in case you’d rather not depend on an environment where zlib is present. Again, what “composable” system do you have in mind? NixOS?

Yes, but which package manager / build system to use? I think it’s harsh to ask of a new language that it solves an unsolved social problem about the proliferation of package managers in addition to whatever technical improvements it already has (borrow-checker and such). I would very much like to get examples of such generic tools, hopefully not tied to a particular OS or OS distribution.

Personally I don’t see any viable way rust could have reused a hypothetical generic package manager (or even worse, a generic build system?). My experience is that cargo makes it much easier to compile existing rust projects and write new ones, actually easier than in OCaml.

bluddy · August 20, 2020, 5:16pm

Right, but they had it from the beginning. This isn’t a feature that can be easily retrofitted to a language 20 years in. Even more obnoxious is gcc’s decision to make it the default.

Chet_Murthy · August 20, 2020, 5:35pm

[dons flame-retardant suit] You mean, like Dune ? [/takes off suit]

dbuenzli · August 20, 2020, 7:21pm

There are quite a few generic build systems out there. And for package managers (yes I think the two should be distinguished) Nix seems indeed to go in the right direction – wonderful PhD thesis, I recommend reading.

Fundamentally you can claim you install mushrooms rather than packages or libraries but in the end the mechanics of library versioning, install and setup for use is mostly language agnostic. It’s just a bunch of compiler datastructures dumped on the disk to be looked up by your compiler in some way.

Nowadays I personally need to be familiar with at least 4 different package systems that basically just resolve constraints, download stuff, run a few programs, copy files in dedicated directories and setup search paths. From a user experience point of view it’s ridiculous. And these systems are not even able to interoperate meaningfully.

Of course nobody is going to agree on a common system because every language specific community will want to be at the root of the dependency chain. Narrow minds. (My contribution to the debate is to write it in λ-calculus + system primitives so that each language can easily compile or interpret it itself).

I didn’t ask to solve the problem I asked to provide the tools so that the social problem can be solved. You can perfectly provide your own integrated tooling built on top of the basic tools needed for integration in more generic systems.

That’s just another way of saying “avoid inversion of control”, “build libraries not frameworks”, etc.

Chet_Murthy · August 20, 2020, 7:48pm

None of these package-managers (Nix, opam, cargo) that privilege the source, and assume as a default, that packages are built from source, are ever going to be suitable for production-systems use, except at the most boutique shops that can afford the sysadmin team to minister to them.

For real production systems, you need a few things:

package names, versions, and dependency-graphs must be expressible and comprehensible to humans
at install-time, everything is binary and no compilers are needed; heck, in many places compilers are forbidden from production systems b/c of the obvious security exposure. [remember when Jobs wouldn’t allow a third-party bytecode interpreter onto the iPhone, because of the security exposure?]
binary-checksum-level reproducibility of completed installation package-sets is nearly non-negotiable.

I mean, when a company whose business -exists- on software, upgrades their hardware, they usually have to re-validate everything, even if the new hardware is just the next version of the same chip from the same vendor.

bluddy · August 20, 2020, 7:53pm

I think a lot of this is also a matter of how going viral affects programming language communities. Think of how much work was needed to unify OCamlers around opam/dune vs the situation that existed before. Now imagine that OCaml wanted to open dune up to haskell, for example. The chance of gaining traction in the haskell community is minimal, even if some haskellers would adopt opam: they have good competing choices used by too many people. The same applies to every other language – in fact, it would perhaps be a good idea to offer opam to languages that don’t have an agreed-upon package manager, such as C++, since in that case there’s a much better chance of succeeding. Of course, once you do, you have to be fully open to the particular needs and wants of that community, and that’s a lot of work, and perhaps that’s the greatest barrier: every language has its particular needs and desires, and unless you have a very well funded organization that can service those needs for all language communities (and figure out how to integrate those features together into one product in a coherent way), you will inevitably be under-providing for some community. That’s not being narrow-minded – that’s just reality.

Chet_Murthy · August 20, 2020, 7:59pm

It would be -much- more tractable if Dune were to generate (reasonably idiomatic) Makefiles, that were then used to execute the actual build. Then “composability” could be achieved at the level of those Makefiles. It would also allow a level of debuggability that is sorely lacking today. [yes, I routinely trawl thru the “log” file that Dune generates, to figure out what’s going on; it might as well be binary, it’s so incomprehensibly big.]

To that end, it is truly regrettable that Dune doesn’t use findlib packages as its unit of modularity.

c-cube · August 20, 2020, 8:26pm

Why generate makefiles when you can generate ninja files?

I agree that if most major build tools were to export rules in such a low level builder, we could have better composability. In the meantime, I’ll keep considering cargo to be as nice as it gets for building and managing dependencies, and dune as nice as it gets for building.

Nix is nice on the theoretical side, but in practice, it’s not per-project, is it? Last time I looked it kind of wanted a global installation rather than just sit in ~ or in $current_project_dir.

Chet_Murthy · August 20, 2020, 8:32pm

ninja files are meant only for -executing-. They’re not meant for reading, or writing by hand.
The goal of “idiomatic” Makefiles is that it be easier to -glue- them together with other bits of Make. For instance, today if you find it painful-to-impossible to compile your C++ code to binary using dune, and you need to do so for a mixed ocaml/C++ project, you’re stuck. But if you could just tell dune to “use this .o file” and then in Makefile language describe how to build it, you’d be free and clear.
