Interesting OCaml Articles

Looking even closer, I see that this is done on JikesRVM. That’s a research toy, and nobody puts their best GC algorithms into research toys (I was at IBM Research, worked for the same folks). Furthermore, they didn’t compare their work against Bacon et al.’s RC-GC (implemented in JikesRVM), AFAICT. I mean, if you’re going to put in explicit memory management, I’d think you’d want to compare against RC-GC also, no? [Looks like they’re comparing against an Appel generational stop-and-copy collector.]

Look: I get that researchers have to do “research”. But I know from experience that the amount of improvement that can be eked out of even really awful starting-point GCs is quite substantial: I watched as guys improved the IBM [ETA: edited] product JVM’s GC considerably, using all sorts of low-level tricks.

ETA: I mean, what did we learn that we didn’t know already from work going back three decades? That explicit memory-management can use less memory than generational stop-and-copy GC? Check. That copying GC can be faster than explicit memory management? Check. That there’s a space-time tradeoff there? Check.


Let me share my new blog post on understanding format6 with examples.

It’s mostly my reading notes on the paper Format Unraveled (about the Format module), plus experiments in utop. I tried not to be too verbose though.


Well, I wrote a sequel to the format6 post:
Understanding format6 in OCaml by diagrams

This time I use just four examples with four diagrams, e.g. one for Scanf.sscanf.

p.s. It’s a pity that I missed Gabriel’s post The 6 parameters of (’a, ’b, ’c, ’d, ’e, ’f) format6 after writing that one.
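
For readers who have not met format6 yet, here is a minimal sketch of what these posts are about. It is not taken from either blog post; the names `show_point`, `labelled`, and `sum` are made up for illustration, and the type annotation is written out by hand only to make the six parameters visible.

```ocaml
(* Sketch only: the annotation is written by hand to show the six parameters. *)

(* Printf.sprintf expects a ('a, unit, string) format, which expands to the
   format6 below: 'a is the curried type of the arguments ending in the
   final result, and the final result (the last parameter) is string. *)
let show_point
  : (int -> int -> string, unit, string, string, string, string) format6 =
  "(%d, %d)"

let printed : string = Printf.sprintf show_point 1 2

(* ( ^^ ) concatenates two formats and chains their argument types. *)
let labelled = "%s = " ^^ show_point
let printed_labelled : string = Printf.sprintf labelled "p" 1 2

(* With Scanf.sscanf the conversions describe what is read, and the parsed
   values are handed to a receiver function. *)
let sum : int = Scanf.sscanf "3 4" "%d %d" (fun x y -> x + y)

let () = Printf.printf "%s | %s | %d\n" printed printed_labelled sum
```

The point of the annotation is that the same string literal gets a different format6 type depending on whether a printing or a scanning function consumes it.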


Not primarily a programming article, but I thought this was an interesting exception because it may be the first time OCaml has been mentioned in the Financial Times: Jane Street: the top Wall Street firm ‘no one’s heard of’ | Financial Times


However, OCaml is not described in very flattering terms: a recondite programming language!


Barring some serious questions about the author’s taste in language design:

Similarly, Rust the language is quite pretty, syntax-wise. Meanwhile, OCaml is so ugly that the community came up with a whole other syntax for it.

I have to agree with the author’s complaint regarding the lack of effective support for metaprogramming in OCaml:

Macros in Rust are great! OCaml has PPXes, which are separate binaries that you build using the OCaml compiler toolkit. They have a very high barrier to entry, and I’ve never built one, and really struggled to even understand the ones I use.


Pretty and ugly is so subjective. I’m always surprised when programmers fall into that trap. It’s almost always a matter of what you’re used to.


You know their oracular memory management used traces to insert “free” as early as possible for a given input. So the programs benchmarked weren’t even necessarily correct in the general case!


Too funny. Too funny.


Hi, I would like to share my recent article about GADTs and state machines: GADTs and state machine

It’s another introduction about GADTs and it explains a bit what I did for Enjoy it and happy hacking!
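
To give a flavour of the idea (this is not the code from the article, just a minimal sketch with hypothetical names such as `door`, `transition`, and `step`), a GADT can index each state and each transition by type, so that an illegal transition is rejected by the type checker:

```ocaml
(* Type-level tags for the two states (hypothetical example). *)
type opened = Opened
type closed = Closed

(* A door value whose type records which state it is in. *)
type _ door =
  | Open_door : opened door
  | Closed_door : closed door

(* A transition from state 'a to state 'b. *)
type (_, _) transition =
  | Open : (closed, opened) transition
  | Close : (opened, closed) transition

(* Only well-typed (state, transition) pairs need a case: the compiler can
   see that e.g. (Open_door, Open) is impossible. *)
let step : type a b. a door -> (a, b) transition -> b door =
  fun door tr ->
    match door, tr with
    | Closed_door, Open -> Open_door
    | Open_door, Close -> Closed_door

let reopened : opened door =
  step (step (step Closed_door Open) Close) Open

(* step Open_door Open   <- rejected at compile time *)
```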


An interesting paper that uses OCaml is one by François Pottier, which gives a declarative DSL for implementing typing rules with applicative functors. It has an associated library, inferno, available on opam.


Our @yallop is one of the authors of this paper about parsing.

flap: A Deterministic Parser with Fused Lexing

Lexers and parsers are typically defined separately and connected by a token stream. This separate definition is important for modularity and reduces the potential for parsing ambiguity. However, materializing tokens as data structures and case-switching on tokens comes with a cost. We show how to fuse separately-defined lexers and parsers, drastically improving performance without compromising modularity or increasing ambiguity. We propose a deterministic variant of Greibach Normal Form that ensures deterministic parsing with a single token of lookahead and makes fusion strikingly simple, and prove that normalizing context free expressions into the deterministic normal form is semantics-preserving. Our staged parser combinator library, flap, provides a standard interface, but generates specialized token-free code that runs two to six times faster than ocamlyacc on a range of benchmarks.


We published a blog post that might be interesting to OCaml devs.
When working with large codebases such as Tezos Octez, it is important to make the code highly readable.
Discover “labelled type parameters” - a lesser-known OCaml trick used by Nomadic Labs devs to reach this objective: Nomadic Labs - Labelled type parameters in OCaml
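
As I understand the trick, and this is my own assumption about the encoding rather than code from the blog post, a single type parameter is constrained to be an object type, and the method names of that object type act as labels for what would otherwise be positional parameters. A minimal sketch with a hypothetical `table` type:

```ocaml
(* One parameter 'p, constrained to an object type whose method names
   label the "real" parameters 'k and 'v (hypothetical example). *)
type 'p table = { entries : ('k * 'v) list }
  constraint 'p = < key : 'k ; value : 'v >

let empty : < key : 'k ; value : 'v > table = { entries = [] }

let add (k : 'k) (v : 'v) (t : < key : 'k ; value : 'v > table)
  : < key : 'k ; value : 'v > table =
  { entries = (k, v) :: t.entries }

let find (k : 'k) (t : < key : 'k ; value : 'v > table) : 'v option =
  List.assoc_opt k t.entries

(* At use sites the labels document which parameter is which. *)
let ages : < key : string ; value : int > table = add "ada" 36 empty
```

Compared with `(string, int) table`, the annotation on `ages` says which type plays which role without the reader having to look up the declaration.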


I’ve very recently written a blog post titled:

“My Thoughts on OCaml vs Haskell/Rust in 2023”

You can also discuss it here:

Hopefully you will find it interesting and worthy of being included in this thread!

Just for the record, there are multiple GC implementations for Rust, for example: GitHub - kyren/gc-arena: Experimental system for rust garbage collection


I had fun reading Playing with Caml Light on DOS, published today (it is not obvious from the title).


This is not about OCaml per se but we see discussion about Unicode and its handling in OCaml repeatedly coming up. I found this article informative.


I just finished reading this. Thank you for posting. Can I please please please suggest you make a front-page article about it? And maybe put the “Conclusions” in the post, to kickstart discussion? I have to say: I was surprised to learn that even UTF-32 is not fixed-width.

ETA: we should all read this article. All of us. And I basically never write anything user-facing. Still, I’m glad I read this.


Note that this article is still an approximation. In particular, extended grapheme clusters don’t really match human-perceived characters because they can’t account for ligatures. For instance, a font renderer may decide to render aesthetic as æsthetic. Similarly, some scripts have a quite complex text layout where segmentation into individual characters is subjective: how many characters are there in द्ध्र्य, ශ්‍ර, 👨‍👩‍👦, or ﷺ? Note that I can’t even predict the answer on your screen for the first two because it will depend on how well your system fonts support Indic scripts, and typically on my system द्ध्र्य and द्ध्र्य are rendered with a different number of “characters”. And this is not even touching the issue of hieroglyphic control characters, whose implementations are a work in progress … everywhere.
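
To make the “even UTF-32 is not fixed-width” point concrete in OCaml terms, here is a small sketch using only the standard library (it assumes OCaml 4.14 or later for `String.get_utf_8_uchar`; the helper `count_code_points` is a name I made up). It counts bytes and code points for a string that most readers perceive as a single character. The standard library has no grapheme cluster segmentation, so going further would need a library such as uuseg.

```ocaml
(* Count Unicode code points in a UTF-8 encoded string by decoding one
   scalar value at a time (requires OCaml >= 4.14). *)
let count_code_points (s : string) : int =
  let rec go i n =
    if i >= String.length s then n
    else
      let d = String.get_utf_8_uchar s i in
      go (i + Uchar.utf_decode_length d) (n + 1)
  in
  go 0 0

(* "e" followed by U+0301 COMBINING ACUTE ACCENT: usually rendered and
   perceived as one character. *)
let e_acute = "e\u{0301}"

let () =
  Printf.printf "bytes = %d, code points = %d\n"
    (String.length e_acute)
    (count_code_points e_acute)
```

Here `String.length` reports 3 (UTF-8 bytes) and `count_code_points` reports 2, which is also the number of UTF-32 code units, while a reader sees one “é”.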