My Thoughts on OCaml vs Haskell/Rust in 2023

It's discussed at the bottom of this page:

https://doc.rust-lang.org/book/ch19-03-advanced-traits.html?#using-the-newtype-pattern-to-implement-external-traits-on-external-types

To quote:

If we wanted the new type to have every method the inner type has, implementing the Deref trait (discussed in Chapter 15 in the “Treating Smart Pointers Like Regular References with the Deref Trait” section) on the Wrapper to return the inner type would be a solution.

Have I misunderstood your original comment?

The Deref trait only helps you in one specific case (going from the new type to the old type, which is just syntactic sugar for &x.0). But what about all the other cases?

More concretely: there is a library A that provides some type A, and a library B that provides a trait B. You want to tie them together, so you define a new type C that is a trivial wrapper over A, so that you are allowed to implement the trait B for C. So far, so good. Now, suppose a function of library A returns a vector of elements of type A (i.e., Vec<A>) and you want to call a function of library B that expects a slice of elements satisfying B (i.e., &[T] where T: B). To do so, you need to turn a slice of A into a slice of C (i.e., &[A] -> &[C]). This is a real transmutation, and no amount of Deref will help you do it in a safe way.

My main point was that modular implicits are a very long-term research project with no prospect of landing anytime soon. So, if you actually need to do work in this area now, then you have to go with the language that we have. And that means finding a different way to map this abstraction. Maybe you need to do some flambda2 work or whatever to make it performant. And maybe doing it will surface other issues and other options.

But for data science and linear algebra work, the main issue is notation. The other features of modular implicits are less important, because the type-level machinery doesn't buy you much safety here. The types you actually need are dependent types, and last I checked, a correctly specified dependently typed interface for data frames was still a research project. Any type-level thing you do short of that is going to be some approximation, and you are still going to have run-time checks and decisions.
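
To make that last point concrete, here is a minimal sketch, using a bare float array array rather than any real library's types: whether a product is even well-formed depends on the run-time values of the dimensions, so the check lives at run time no matter how clever the non-dependent types are.

```ocaml
(* Hypothetical naive multiply, for illustration only: the dimension
   compatibility of [a] and [b] is a property of run-time values, so the
   check has to happen at run time regardless of how the API is typed. *)
let mat_mul (a : float array array) (b : float array array) =
  let n = Array.length a and m = Array.length b in
  let k = if m = 0 then 0 else Array.length b.(0) in
  if n > 0 && Array.length a.(0) <> m then
    invalid_arg "mat_mul: inner dimensions do not match";
  Array.init n (fun i ->
    Array.init k (fun j ->
      let s = ref 0.0 in
      for l = 0 to m - 1 do
        s := !s +. a.(i).(l) *. b.(l).(j)
      done;
      !s))
```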

I was using multimethods as an example of an alternative solution that exists. Similarly, C++'s Eigen library is extremely fast and makes do without even that. Array programming languages get by despite being unityped. I haven't seen a good argument, beyond notation, that shows why typeclasses / modular implicits are the best approximation of the relevant types, to the exclusion of all other approaches.

The point was simply that there are other approaches, taken by other languages, that are being excluded by assumption here. Some of them might be easier to encode into OCaml's current type system in a way that provides more notational convenience than you currently get with Owl.

I gave almost no thought to the details of how to implement multimethods using polymorphic variants, but the gist of the idea was to hand code (or generate with a ppx) the dispatch function by matching on all the cases for all of the arguments. So you'd have one ordinary function that "figures out" which actual implementation it should call based on the variants of its arguments.
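
For what it's worth, here is roughly the shape of what I had in mind, as a minimal sketch. The dense/sparse wrappers and the two operations are invented for illustration; a real version (hand-written or ppx-generated) would have to cover the full matrix of cases.

```ocaml
(* Hypothetical value wrappers; a real library would have many more. *)
type value = [ `Dense of float array | `Sparse of (int * float) list ]

(* One specialised implementation per combination of argument kinds. *)
let add_dense_dense a b = Array.mapi (fun i x -> x +. b.(i)) a

let add_sparse_dense s d =
  let d = Array.copy d in
  List.iter (fun (i, x) -> d.(i) <- d.(i) +. x) s;
  d

(* The "multimethod": one ordinary function that matches on the kinds of
   all of its arguments and dispatches to the right implementation. *)
let add (a : value) (b : value) : value =
  match a, b with
  | `Dense x, `Dense y -> `Dense (add_dense_dense x y)
  | `Sparse s, `Dense d | `Dense d, `Sparse s -> `Dense (add_sparse_dense s d)
  | `Sparse _, `Sparse _ -> failwith "not needed for this sketch"
```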

The goal was simply to be illustrative. I don't use Owl enough to actually know what the correct solution is, or where the roughest parts of the usability are. But I do know that everyone insisting we need modular implicits isn't going to make the research happen any faster or fix any actual problems people are having right now.

In fact, maybe what Owl does now is actually fine and we just need some more elaborate ppx stuff to make the syntax better. (I have yet to see a compelling illustration of why this isn’t possible.)
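
As a data point for how far the current language already goes, here is a rough sketch with a made-up operator module standing in for whatever Owl actually provides: define the operators once and open them locally at the use site. A ppx could shave this down further, but even without one the call sites are not that heavy.

```ocaml
(* A made-up module of matrix operators, standing in for a real library. *)
module M = struct
  let add = Array.map2 (Array.map2 ( +. ))        (* element-wise sum *)
  let scale k = Array.map (Array.map (( *. ) k))  (* scalar times matrix *)
  let ( + ) = add
  let ( *. ) = scale
end

(* The operators are only in scope inside M.( ... ), so nothing leaks. *)
let blend a b c = M.(0.5 *. a + 0.5 *. b + c)
```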

Worst case scenario, modules can do everything typeclasses can do. They are just syntactically heavier. So you could, if you had to, write a DSL for scientific code that turned “typeclass” specifications into a bunch of module definitions and then made your code explicit about what was going on. Doing it as a DSL would probably be easier than having to do it in the general case as part of the language, especially since scientific code is fairly straightforward by comparison to the general case.
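
To make the "syntactically heavier" point concrete, here is a minimal hand-written version of what such a DSL could expand to; all of the names are invented for the example. The "typeclass" is a module type, the "instance" is a module, and the generic code takes the instance explicitly, which is exactly the boilerplate a DSL (or, someday, modular implicits) would hide.

```ocaml
(* The "typeclass". *)
module type RING = sig
  type t
  val zero : t
  val ( + ) : t -> t -> t
  val ( * ) : t -> t -> t
end

(* An "instance". *)
module Float_ring : RING with type t = float = struct
  type t = float
  let zero = 0.0
  let ( + ) = ( +. )
  let ( * ) = ( *. )
end

(* Generic code, written once for every ring. *)
let dot (type a) (module R : RING with type t = a) (xs : a array) (ys : a array) =
  let acc = ref R.zero in
  Array.iter2 (fun x y -> acc := R.(!acc + (x * y))) xs ys;
  !acc

(* The instance has to be passed explicitly at each call site. *)
let example = dot (module Float_ring) [| 1.; 2. |] [| 3.; 4. |]
```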

But again, this is something the people who need this solution should experiment with. That or they need to convince someone capable of implementing modular implicits to make it a real priority instead of a “someday maybe” thing.

But nothing is going to change if everyone just holds their breath for a research feature that may never land. And if that feature is really and truly necessary, and none of the alternatives are workable, then, at least for the time being, OCaml just isn’t an appropriate language for that kind of code.

I have a comment about a different but related issue. I think it is a mistake to believe that OCaml could attract lots of Python devs (doing science or data science) if it had better math notation. The large majority of those devs want a dynamically typed language designed for quick scripting.

In addition, they have very little motivation to move to a new language, since the Python ecosystem is working very well for them. Even Julia, which has dynamic typing and notation designed for math, hasn't been able to attract very many of them.

A smallish minority of those Python devs may actually want a functional language and/or static type checking. And it would be great to attract those devs to OCaml. But I think it is a mistake to make large changes to OCaml that are specifically targeted at that small segment. It would be better to consider a much broader mix of OCaml use cases when designing and prioritizing changes.

I'm not saying that modular implicits are designed for math notation, only that I notice a big emphasis on support for math notation when priorities are discussed here.

There is a significant group of people who use Rust to speed up Python, specifically in quantum computing, and specifically in areas using matrix-math, replacing numpy/scipy with Rust. I worked in this area, and while you can say that numpy/scipy offer "mathematical notation", I would argue that what they offer is "less-heavy type-ful code". And that's what Rust's traits offer, which is why Rust is taking off in a domain where writing the same matrix-math code in OCaml would be painful beyond imagination.

There is a significant group of people who use Rust to speed up Python

I agree, but that set of people did not move from using Python to Rust; they moved from C/C++ to Rust when writing libraries for use in Python. I was referring to the discussions here about making OCaml suitable for science and data science directly, where it is correctly stated that the OCaml syntax for math is lacking. (Sorry if I changed the subject of this thread; I may have jumped in at the wrong point. There are several threads where this has come up.)

It may be possible that OCaml could be used to write math-oriented libraries for Python, rather than writing the libraries in Rust or C/C++, and that OCaml could be improved for this use case. But I think the priority of this should be relatively low, since OCaml's strength is not in creating standalone compute engines. Or do you really think this use case is an important one?

The main use case for heavy math in OCaml that I can imagine is adding data-science-based features to OCaml apps. This is going to happen for all kinds of apps, of course. And if the current math notation is so horrible that it can't be done, then maybe OCaml needs modular implicits for that use case.

My only point is to be careful to identify the specific use cases being targeted, and not imagine they are larger than they really are.

Uh, actually, no. They started with piles of Python numpy/scipy code, found it was too slow, and sure, they tried C++, but that was too hard. So they went to Rust. This is the specific story of how Rust is getting used in quantum computing, but I suspect it's happening all across the numpy/scipy user base.

Two thoughts:

  1. Saying that they moved from C++ to Rust isn't quite accurate: they found C++ too hard, so they tried Rust, and it wasn't so hard and was just as fast. But this came along with an explosion of people learning and using Rust for writing low-level math code.

I'm one of those people, and yet I know C++ really well. Am I moving "from C++ to Rust"? I don't think so; it doesn't feel like it. I feel like I'm finding that Rust is a great place to write sparse-matrix code, period.

  2. And this Rust sprs sparse-matrix package is a great example. It isn't "math code". It isn't written in "mathematical notation". It's written in Rust, but with very few types in the code, because type inference and traits fill them all in. Calling the implementation of sprs "mathematical code" is just wrong: it's the guts of a math library, and it's compact and comprehensible because it's written using traits, Rust's analogue of modular implicits.

Now: maybe you think nobody will use OCaml for writing math libraries and matrix-math codes. But nobody would have used Python for it, if somebody hadn’t come along with numpy. So maybe you’re saying that you just don’t care about matrix-math. And sure, that’s a fine position to take.

I guess.

Today they use Rust because it's easier than C++ and performs similarly; tomorrow they will discover Mojo is easier than Rust and performs similarly.

It's been forever since I've looked at it, but I doubt your prediction. Rust is a systems-programming language, a replacement for C++, and it manages that while feeling like programming in OCaml. These numpy/scipy programmers use it because they like the value of strong typing; they just don't want to pay the overhead of a zillion repetitions of type and module names.

ETA: once upon a time, that would have resonated with the ML community.

And Mojo bills itself as an AI programming language with the ease of use of Python and the performance of C: "Mojo 🔥: Programming language for all of AI" (created by the guy who made Swift, btw).

I've never programmed Mojo, but from reading about it a good while ago, it seemed to be just that: a language with a compiler that understood and sped up various Python matrix-math idioms used heavily in AI.

But that’s not what Rust is: Rust is useful for doing matrix-math because it has traits and is generally as efficient as C++.