I once saw a Java based ECN (that is, a securities exchange) that had been written to essentially never allocate ever. It worked and had high performance with no garbage collections, but the code style was very peculiar as a result of the constraints. I think I’d rather synthesize imperative code that didn’t allocate than have to write it by hand. I’m not sure what OCaml code that operated on the same principles would look like but it would be interesting to see.
There are two kinds of allocation worth avoiding where possible in performance-critical sections of OCaml programs:
- Objects allocated directly in the major heap, i.e. those with mutable fields or those not small enough for the minor heap. Avoid these by using non-mutable data structures for frequently allocated objects.
- Minor heap objects unnecessarily allocated at runtime when they could be allocated at compile time (and the compiler has no way to know that). Avoid these by using flambda and the profiler.
At least that’s what my intuition tells me. Corrections and additions welcome.
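One way to check by hand whether a piece of code allocates in the minor heap is to compare `Gc.minor_words ()` before and after running it. The following is only a rough sketch: `minor_words_allocated`, `sum_loop` and `sum_via_list` are hypothetical names made up for illustration, and the exact word counts depend on the compiler version, the flags, and whether the code is built as bytecode or native.

```ocaml
(* Hypothetical helper: run [f] and report how many words it allocated
   in the minor heap, using the standard Gc.minor_words counter. *)
let minor_words_allocated (f : unit -> 'a) : 'a * float =
  let before = Gc.minor_words () in
  let result = f () in
  let after = Gc.minor_words () in
  (result, after -. before)

(* Imperative sum: a local ref and a for loop, no intermediate structure. *)
let sum_loop (a : int array) : int =
  let total = ref 0 in
  for i = 0 to Array.length a - 1 do
    total := !total + a.(i)
  done;
  !total

(* Same result, but builds a temporary list first
   (one cons cell per array element). *)
let sum_via_list (a : int array) : int =
  List.fold_left ( + ) 0 (Array.to_list a)

let () =
  let a = Array.init 1000 (fun i -> i) in
  let _, loop_words = minor_words_allocated (fun () -> sum_loop a) in
  let _, list_words = minor_words_allocated (fun () -> sum_via_list a) in
  Printf.printf "loop: %.0f words, list: %.0f words\n" loop_words list_words
```

The loop version should report little or no allocation, while the list version allocates in proportion to the array length. The measurement itself can add a few words of noise, so it is better suited to comparing two variants than to proving that something is exactly zero-alloc.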
OCaml is surprisingly good about letting you switch in and out of this style, on a module by module basis.
It’s not quite a zero-alloc application, but Incremental is a good example of a library that has been heavily optimized to avoid allocation, and does a few tricks using OCaml’s unsafe APIs to make it possible.
Rust, for what it’s worth, also requires you to use unsafe APIs to do the full range of things that are required in a high-performance system that makes good use of the hardware.
Looking at incremental, I get the impression that it’s mostly about using imperative code and avoiding functional features, which is… ok, I guess. It makes sense to do this for high performance libraries, but I wouldn’t want to do it for an entire codebase. You could use OCaml as an OOP language as well, but that’s not something I’d ever want to do, either.
It would be nice to have a way to keep track of code patterns that allocate minimally or not at all with flambda as it advances.
Also, this makes me think we really want to work on flambda inter-operating with non-flambda code, so that performance-critical libraries can compile with flambda and still allow other code to benefit from faster compilation.
Re: low level programming in otherwise garbage collected HLLs, one of the best tricks I’ve seen was done I think by Scheme48. The idea was to program in a fairly non-idiomatic subset of the language that could be mechanically compiled into a low level language (like C). You could then implement and debug in your favorite language, albeit a very stilted and unnatural version of it, and then convert in the last step into a very efficient language.
Programs written in systems like YNot and Bedrock can also be converted pretty mechanically into low level languages while retaining their formally verified properties, even though they are written and debugged in a high level functional system.
The thing incremental is doing is fundamentally imperative. It’s implementing the runtime for a functional-style DSL, which involves manipulating an imperative graph.
And I’m not embarrassed about writing imperative code. OCaml is an excellent imperative (non-OO) language, and when you want maximum performance, that’s often how you need to write your code. One of OCaml’s strengths is that it’s effective in multiple modes.
Indeed. One frustration I have had with pure functional languages was that when a stateful solution is clearly less painful, there is no escape from the purity. OCaml provides a nice balance.
https://github.com/fstarlang/kremlin implements a similar mechanism in an industrial-strength way. I think some recent version of Mozilla Firefox is using a TLS lib developed with it. Basically you can program in a high level functional language with proofs and verification, then target C and thus avoid the GC. Of course OCaml plays a star role in enabling this.
The decisive advantage: OCaml is a French technology made at INRIA.
You might find this article interesting – it explains how to profile heap allocations and how to rewrite code in mostly stack-friendly style:
My two cents :
Ocaml strength:
- Garbage collected functional language, easy to write and fast for prototyping up to an MVP. OCaml is simple and has low cognitive overhead. In my opinion, Rust is much much harder. It’s also much easier to refactor OCaml code, as it’s easier to abstract things (less boilerplate) and avoid leaking implementation details (this is better in Rust with impl Trait now, I think).
- The module system.
Functors and modules are a very good tool for large scale polymorphism. They allow you to build your code in small independent bricks and compose them together. Signatures also make it easy to manage how much internal detail you expose outside of the module.
In theory you can do the same in Rust. But traits are not as powerful a tool for this task, as they are more focused on small scale polymorphism.
Sure, you can have associated types and associated constants, but common private code and types must be declared outside in a module, and you need to wrap types in newtype structs, implement forwarding traits, and use visibility qualifiers, which involves a lot of boilerplate. Likewise, in highly polymorphic code, trait where clauses can become a nightmare.
- Concurrency/async programming:
OCaml libraries like Lwt/Async are easy to use, and fast enough. Rust introduced futures a while ago but the language was simply not ready for that. It ended up being a real nightmare. And it’s still a total mess, at least until async/await is implemented (a bit better with impl Trait now). Hopefully. The worst part? It’s 10 times harder to write, and per-core performance is in the same order of magnitude as the OCaml equivalent.
- Basic tooling
opam. Opam is a gem <3, cargo dependency management is a pain to deal with in comparison. I much prefer opam’s optimistic approach and the help of a powerful solver. Also it’s stupidly fast compared to cargo.
merlin is much more robust than the new Rust Language Server. And it’s fast. RLS takes several seconds to react to a change.
- Stability: the language and lib authors don’t follow this stupid idea that “it’s semver, I’m not 1.0 yet, I can break it every month”. It’s getting better in Rust, but the OCaml ecosystem is still more mature. I also like the slow pace of OCaml development. Big features take a long time, but when they come, they feel polished and not half baked.
- Expressive type system and features: GADTs, polymorphic variants, ephemerons, first class modules. You don’t use all this stuff every day, but it is there when you need it. In Rust, you basically only have one tool: traits, without HKT. Also, when you start pushing the trait system, like bounds on associated types or higher ranked lifetimes in bounds, it can backfire hard (compiler bugs, strange behavior, and really really really really hard to understand type errors). Also, did I mention the where clause spaghetti yet?
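To make the module system point above a bit more concrete, here is a minimal sketch of a functor whose signature hides the representation. All the names (`ORDERED`, `SET`, `MakeSet`, `IntSet`) are hypothetical and chosen for illustration only; the standard library’s own `Set.Make` is the real-world version of this idea.

```ocaml
(* What the functor needs from its argument. *)
module type ORDERED = sig
  type t
  val compare : t -> t -> int
end

(* What the functor exposes: [t] is abstract, so callers cannot
   depend on how the set is represented. *)
module type SET = sig
  type elt
  type t
  val empty : t
  val add : elt -> t -> t
  val mem : elt -> t -> bool
end

(* A deliberately naive implementation backed by a list. *)
module MakeSet (O : ORDERED) : SET with type elt = O.t = struct
  type elt = O.t
  type t = elt list
  let empty = []
  let mem x s = List.exists (fun y -> O.compare x y = 0) s
  let add x s = if mem x s then s else x :: s
end

(* The standard Int module already provides [t] and [compare]. *)
module IntSet = MakeSet (Int)

let () =
  let s = IntSet.add 3 (IntSet.add 1 IntSet.empty) in
  Printf.printf "3 in set: %b, 2 in set: %b\n" (IntSet.mem 3 s) (IntSet.mem 2 s)
```

Because `SET` keeps `t` abstract, the list could later be swapped for a balanced tree without touching any caller, which is the kind of control over exposed detail described above.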
Rust strength :
- Type classes (traits). Traits can be a pain sometimes but they are cool for some use cases.
- Performance. OCaml is fast enough, and yes, obviously you can trade a heck of a lot of money with it. But some basic crunching operations are more than 20 times faster with Rust/LLVM.
- RAII + “linear” types (or whatever they are called) make your code foolproof when tracking resources. Say goodbye to this ugly “with_foo” pattern. It is also much much harder to introduce data races in concurrent code than in OCaml.
- No exceptions/open types: at first it makes you cry to have to deal with all of this explicitly, but say goodbye to the catch-all exception pattern that handled an “Out_of_memory” or “Invalid_argument” exception 4 frames above and messed up everything.
- No GC. I see there is discussion about GC latency being OK. While I agree in the general case, there are still two issues: 1) Compaction. Compaction can destroy your worst case response time really hard, we are talking seconds here. 2) Throughput on large heaps: the OCaml GC allocation policy is simple, and that’s a strength, because it allows it to be fast, and it relies on compaction to clean up the fragmentation mess. But on large heaps, it starts to fall short. When you go above ~70/80G, compaction can take half a minute to run, and if you try to run it less often, it will get worse: you will pay the price at each major GC in freelist walks.
Sure you can implement off-heap storage in OCaml, but in Rust it just works fast out of the box using state of the art allocators.
- Multi core. Write a simple 200 LOC HTTP endpoint, launch it on 96 cores with SO_REUSEPORT and task pinning in 3 LOC, scale up to 150K requests per second. Profit.
- Some killer packages like serde for serialization. Serde is a real gem. There is also good support for HTTP, and a stdlib that is not from the 90’s.
- Personal rant: easy to use build tool, using syntax readable by normal humans.
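For readers who have not met it, the “with_foo” pattern mentioned in the RAII point above looks roughly like this in OCaml. This is a minimal sketch: `with_file_in` and `first_line` are hypothetical helpers written for illustration, built on the standard `Fun.protect` (available since OCaml 4.08).

```ocaml
(* Hypothetical helper in the "with_foo" style: the channel only exists
   inside the callback, and is closed whether [f] returns or raises. *)
let with_file_in (path : string) (f : in_channel -> 'a) : 'a =
  let ic = open_in path in
  Fun.protect ~finally:(fun () -> close_in_noerr ic) (fun () -> f ic)

(* Usage: read the first line of a file without leaking the channel. *)
let first_line (path : string) : string =
  with_file_in path input_line
```

Nothing in the types stops the callback from stashing the channel somewhere and using it after it has been closed, which is the kind of mistake Rust’s ownership rules are meant to rule out at compile time.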
My conclusion:
use ocaml unless you really need rust.
That’s my general feeling, too. OCaml is a good every-day language for most things, Rust is probably the right choice for building things where performance really matters. I’d use OCaml to build a compiler or a system utility, but Rust to build a high performance microkernel. You probably will want to reach for OCaml a lot more often.
I agree with that sentiment, but it’s not that simple. OCaml does not have the libraries and the ecosystem to make that viable. In fact, I learned Rust because I wanted a language with advanced PL constructs that supported the technologies that I use on a daily basis. I would not have discovered Rust if OCaml had the libraries that I needed. I’m glad I learned Rust though. It’s great for specific use cases, and I ran into one (sidecar container) sooner than I expected.
What are you missing?
My list is very long
- Messaging. We use gRPC and GCP pubsub
- Elasticsearch
- GCP APIs
- A more reliable redis library
- HTTP/2 server
- GraphQL client that supports the full spec
Those are only the technologies we currently use in our stack. We’re evaluating a few more (NATS Streaming, FoundationDB) which also do not have OCaml support.
Would you mind adding these to the missing pieces page on OCamlverse?
That’s a good page. I’ll spend some time populating it this evening.
I’ve started implementing bindings for Foundationdb. Let me know if you’re interested in collaborating
Since we have arakoon in the OCaml ecosystem, what’s the need for foundationdb?
I would rather contribute to your ocaml-graphql-server project first since that’s something that I would definitely use. I read the GraphQL spec, and I’ve been experimenting with Angstrom to see how I can help.