Fastest implementation of stack-safe List.map, featuring Containers, Batteries, and Base

antron · June 22, 2017, 1:16am

…presumably, this also applies to related functions.

@jsthomas has posted some interesting measurements of different ways to implement List.map. Some points/questions:

The overall goal is to have a tail-recursive List.map that won’t crash your whole program on a large list (the linked instance was seen on the mailing list just a few days ago!).
It shouldn’t be significantly slower than stdlib’s stack-unsafe List.map. That requires optimizations.

There are at least three effective optimizations: (1) loop unrolling, (2) building the result list front-to-back using mutation, and (3) changing the implementation based on the size of the list.
Containers and Base use (1) and (3), and Batteries uses (2).
@jsthomas was able to combine them all to get a map that looks faster than what’s now in Containers, Batteries, and Base. In particular, it maintains good performance for all list sizes. And, it’s likely possible to make @jsthomas’ map go even faster.

All of Containers, Batteries, and Base are faster than stdlib for small lists. The advantage of Batteries is not as great as the other two, but only Batteries remains faster than stdlib on very large lists.
Why don’t Containers and Base use mutation like in Batteries? Is it because the memory representations of the types involved aren’t guaranteed across different kinds of OCaml runtimes? Are the representations more likely to coincide if one uses 4.03 inline records to declare the mutable list type? Or perhaps performance on large lists is just not that important in practical situations?

There is an interesting observation that making a non-tail call can be more expensive than allocating some heap space (in OCaml). Often, programmers don’t think of a function call as an allocation, but it is indeed allocating stack space. However, the full cost of the space should count deallocation, and stack space is reclaimed faster than garbage collection is done in the heap.

https://jsthomas.github.io/map-comparison.html

c-cube · June 22, 2017, 8:37am

@antron in containers, I don’t use the mutation trick because it’s fundamentally unsafe. It’s Obj in disguise, and it’s lying to the compiler about some values being immutable. So I avoid it because I prefer to err on the safe side (especially with flambda and Pierre Chambart promising to break all code that uses Obj.magic).

If there was a safe way of doing it, I would of course be interested.

zozozo · June 22, 2017, 8:46am

@c-cube well there is https://github.com/ocaml/ocaml/pull/181 , but whether it’s going to be merged is another question.

c-cube · June 22, 2017, 8:48am

Even if it was merged (say, in 4.06), I wouldn’t use it for years, because containers must be compatible with older versions of OCaml.

bluddy · June 22, 2017, 4:27pm

The same problem exists with regard to Multicore OCaml. Much of the design is built around different optimizations that are possible with immutable and mutable values, and this trick of lying to the compiler will almost certainly not work there.

Topic		Replies	Views
A new List.map that is both stack-safe and fast Community article , performance	34	12614	February 24, 2018
Stdlib.List loop unrolling Learning	11	970	January 28, 2023
Stack overflow during evaluation (looping recursion?) Learning	16	1781	April 17, 2023
Array.map vs List.map performance Learning performance	6	1274	May 14, 2024
Rewrite standard library recursive functions as loops? Learning recursion	14	1899	September 1, 2017

Fastest implementation of stack-safe List.map, featuring Containers, Batteries, and Base

Related topics