My Thoughts on OCaml vs Haskell/Rust in 2023

I think the best approximation is, indeed, the pattern

    let fun_name : typ -> typ -> typ =
      fun arg0 arg1 -> ret

include struct goes way overboard for the marginal benefit of spelling the function name twice

There is a nice library for GC in Rust, so this is no longer a clear advantage of OCaml or Haskell: kyren/gc-arena on GitHub (“Experimental system for rust garbage collection”).

This is marked as experimental…

It’s a nice start and certainly worth watching. But it’s mark-and-sweep, and one thing we’ve learned from decades of GCed languages is that it’s worth using a generational GC with at least a nursery. In Rust, using non-GCed types is a workable substitute, but I doubt that it’s enough: the extra work of deciding whether a type is supposed to be stack-allocated or GCable is nontrivial. For example, one of the original great advantages of OCaml was that closures were stack-allocated and moved to the heap only when a pointer to a closure was returned from a function call (and in many cases, code that appeared to return closures in fact didn’t do so, due to evaluation order). Imagine having to specify whether each closure lives on the stack or the heap.
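To make the closure point concrete: in Rust the choice between a by-value closure and a heap-allocated one already shows up in the function signature. A minimal sketch, with function names invented for illustration:

```rust
// Returned by value: the closure has one concrete (anonymous) type,
// so no heap allocation is needed.
fn make_adder(n: i32) -> impl Fn(i32) -> i32 {
    move |x| x + n
}

// Boxed: the two branches produce different closure types, so they are
// hidden behind a trait object, at the cost of a heap allocation.
fn make_op(add: bool, n: i32) -> Box<dyn Fn(i32) -> i32> {
    if add {
        Box::new(move |x| x + n)
    } else {
        Box::new(move |x| x * n)
    }
}
```

Calling `make_adder(2)(3)` and `make_op(true, 2)(3)` both give 5, but only the second allocates, and the programmer had to decide that up front.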

I’m not saying that this is worthless; rather, that it’s early days.

It’s used in big projects already, like Ruffle.rs: https://github.com/ruffle-rs/ruffle/tree/master/ruffle_gc_arena

Rust is built on the assumption that there is no GC. Everything in the language and compiler design is centred around that. Can we have some limited GC using libraries? Yes, sure we can, and Ruffle and other libraries are good examples.

This is not unique to Rust. In C/C++, for instance, we have the famous Boehm garbage collector.

Good GC involves intricately designing your language semantics and internal data structures with garbage collection in mind. It cannot be bolted on with a library if you want performance, generality, lack of complexity, lack of weird edge cases etc.

TL;DR Regardless of what library you show me today or in the future in Rust, I just don’t think Rust GC can approach anything OCaml/Java/C# etc. offer.

Newer anchor OCaml libraries like Eio use objects/classes quite extensively

Just fyi, Eio has recently stopped using objects/classes and switched to polymorphic variants.

Really? That’s a pretty wild journey they went on. Is there a blog post about this change somewhere?

More details here: “Replace objects with variants” by talex5, Pull Request #553 in ocaml-multicore/eio on GitHub.

It might be that the relevant comment got moved somewhere else between when you posted this and now, but the only real comment there now is an unanswered question about why Jane Street wanted Eio to use (polymorphic) variants instead of objects.

There are other ways to get similar notational convenience for the types of numeric problems you are interested in using typeclasses for. For example, Julia uses multimethods to great effect. Since notation is really what we need for this use case, it might be worth considering whether there’s another way to achieve the same goal. (E.g., you could probably mimic Julia’s multimethod dispatch logic with polymorphic variants. But you’d probably need a ppx or some syntax tweaks to make it not hideous. Still, those are easier to make than a whole new language feature with uncertain type-system implications. Hell, given how little I know about the inner workings of the object system, for all I know, multimethods could be a feasible language extension.)

Also, long-term, I don’t think type-classes are “enough”. Ultimately, we want to be able to use the type system to prevent errors. Otherwise, what’s the point? But coming up with a way to have rank polymorphic functions that give you the power of the APL family of languages while being statically type-checked and efficiently compiled is a huge undertaking.

Similarly, it would be great if someone could find a way to encode the relevant information needed to verify floating point intensive code into the type system so that you didn’t need to use Coq to prove important properties.

Adding whatever we need in order to have typed units would be good too. (I forget why F# can support them but OCaml can’t.)

I could go on. But the point is that there is room for OCaml to innovate here. “We specifically need modular implicits” is too limiting.

You might be right that some other means would suffice. My own formative experience in this regard was when I hacked on the sprs (sparse matrix) package in Rust and saw how concise code could be through the use of Rust traits. It sure appears to me that those things that can be done in Rust can also be done in OCaml+modular implicits. Maybe other means would suffice, and if you have examples of how to adapt Julia’s multimethods, or how to use PPX extensions to achieve similar conciseness, I’d love to learn of them.

If you can show some motivating examples that show a way forward, I’d be happy to pick up that ball and run with it: I just don’t quite see how any of the possibilities you list could work, whereas I can see pretty clearly how a Rust-like trait mechanism would work.

The initial comment of the PR explains the motivation and gives an overview of the changes. I was referring to that when I said ‘more details’.

On typeclasses versus multimethods: I’m not sure about the compared difficulty of adding either feature to OCaml but, as a user of OCaml I’d much rather have typeclasses than multimethods.

  • Multimethods are limited to the object system, which is far from being pervasive in OCaml.
  • Multimethods are dynamic, while the problems you want to tackle with typeclasses are static. Not only do you pay an unnecessary run-time penalty, but it also raises the question of how independent libraries compose (may module B hijack a call to f located in the independently compiled module A?), and it forces you to resolve ambiguities (next point).
  • In case of ambiguities (because of subtyping), the resolution mechanism is obscure and unstable (what if you change the order of arguments while refactoring?). Because it is dynamic, failing is not acceptable. By contrast, with typeclasses it’s dead simple: if there are several possible instances, just fail at compile time.

This is not that simple, since the compiler might not know at compile time all the possible instances (due to separate compilation). They solved this issue in Rust by being extremely restrictive: you can only define an instance in the crate that defines the type or in the crate that defines the class (a trait, in Rust lingo). (Both places are necessarily visible at compile time.) Anywhere else is a hard error. As a consequence, if a user gets a type from library A and a class from library B, they are not allowed to write the corresponding instance, since it would live neither in A nor in B. You have to convince the developers of either A or B to do it for you.
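A small sketch of that restriction, using the standard library as the “other crate”. The names `Describe` and `Meters` are made up for illustration; the rejected case is left as a comment because it does not compile:

```rust
use std::fmt;

// A trait defined in this crate (hypothetical name).
trait Describe {
    fn describe(&self) -> String;
}

// Allowed: local trait, foreign type.
impl Describe for Vec<i32> {
    fn describe(&self) -> String {
        format!("a vector of {} integers", self.len())
    }
}

// A type defined in this crate (hypothetical name).
struct Meters(f64);

// Allowed: foreign trait, local type.
impl fmt::Display for Meters {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} m", self.0)
    }
}

// Rejected by the compiler (error E0117): foreign trait, foreign type.
// impl fmt::Display for Vec<i32> { /* ... */ }
```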

My apologies, I wasn’t aware of that. But that confuses me: my belief with typeclasses is that the instances are elaborated during the typing phase, with instances available in then-known interfaces (cmi); not later during the linking phase. Binding a call to f in a module A to code found in an independent module B, unknown to A until linking, sounds a bit like “hijacking” to me. Am I wrong to expect that of a hypothetical OCaml-with-typeclasses, or else, what am I misunderstanding?

Is it the same distinction that @jhw is describing in this reply in another thread, where Rust would achieve canonical coherence, with the drastic restrictions you mentioned, while my expectation about OCaml-with-typeclasses would correspond to non-ambiguous coherence?

I suppose so, though I don’t think I ever heard the adjective “canonical” in the context of Rust’s trait resolution.

Also, just in case the restrictions I mentioned seem too drastic, let me expand a bit. When library A provides you with a type and library B provides you with a trait, the usual solution is to define a trivial wrapper type, so that you can implement the trait on this new type rather than on the original one. The downside is that you will have to insert (trivial) conversions to the original type whenever you need to call into library A.
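A minimal sketch of the wrapper approach, where `Vec<i32>` stands in for the type from library A, `fmt::Display` stands in for the trait from library B, and `Row` and `total` are hypothetical names:

```rust
use std::fmt;

// The trivial wrapper: a local type holding the foreign one.
struct Row(Vec<i32>);

// Now the foreign trait can be implemented, because `Row` is local.
impl fmt::Display for Row {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let cells: Vec<String> = self.0.iter().map(|n| n.to_string()).collect();
        write!(f, "[{}]", cells.join(" | "))
    }
}

// Stand-in for a "library A" function that only knows the original type.
fn total(v: &Vec<i32>) -> i32 {
    v.iter().sum()
}
```

With `let row = Row(vec![1, 2, 3]);`, printing works through the wrapper (`row.to_string()`), while calling into “library A” needs the explicit conversion `total(&row.0)`.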

I think you can implement Deref and then it’s automatic, no?

No, the Deref trait corresponds to the dereference operation, i.e., *p, and is meant to implement smart pointers. So, this is unrelated. Perhaps you had in mind the autodereference feature of Rust, but again, this is about dereference and thus unrelated.

What I was alluding to is the fact that, if you have a reference to the old type and need to turn it into a reference to the new type, there is no easy way to do it without unsafe code (similar to the use of Obj.magic in OCaml). In fact, this is the kind of stuff where you need to write a whole preprocessor to make it seamless: dtolnay/ref-cast on GitHub (“Safely cast &T to &U where the struct U contains a single field of type T”).
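For illustration, here is roughly what the hand-written version of such a reference cast looks like. This is a sketch under the assumption that the wrapper is declared `#[repr(transparent)]`; `Wrapper` and `from_ref` are hypothetical names, not ref-cast’s API:

```rust
// `repr(transparent)` guarantees the wrapper has the same layout as its
// single field, which is what makes the pointer cast below sound.
#[repr(transparent)]
struct Wrapper(Vec<i32>);

impl Wrapper {
    // Turn a reference to the old type into a reference to the new type.
    fn from_ref(v: &Vec<i32>) -> &Wrapper {
        // SAFETY: `Wrapper` is #[repr(transparent)] over `Vec<i32>`, and the
        // returned reference borrows for the same lifetime as `v`.
        unsafe { &*(v as *const Vec<i32> as *const Wrapper) }
    }

    fn len(&self) -> usize {
        self.0.len()
    }
}
```

The `unsafe` block is the part a user would otherwise have to write and justify by hand for every wrapper type.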