Array.map vs List.map performance

patrick-nicodemus · April 22, 2024, 3:51pm

All other things being equal, would you expect Array.map or List.map to be faster, given that the arguments have the same number of elements n?

As far as I can tell, this should be equivalent to “Is it faster to allocate many small regions of memory or one large region of memory”. One also has to traverse the input list, which is bad for cache locality.

Maybe the question depends on the size of n.

I would guess that Array.map is faster but this is just intuition, I would like to know if someone has benchmarked this or has internal knowledge of how OCaml allocates memory that would point in one direction or the other.

vlaviron · April 22, 2024, 4:38pm

It depends.

In isolation, Array.map is usually a bit slower, as it involves GC barriers. But the time of the function itself is often less important than the impact on the rest of the code.

For very small values of n, lists could end up much better because the optimiser can track them (this is mostly relevant with the native compiler with flambda on). For instance, List.map f [x] can be optimised into [ f x ], removing the initial allocation for [x], and if you chain list-manipulating functions you could end up not actually allocating any lists except the last one. But that’s very limited: you need manual annotations ([@unrolled]) if you want that to work with lists bigger than 1, even at -O3.

For very large values of n arrays will likely be better due to the reduced memory footprint. Additionally, they have better cache locality both because of the more compact representation and the guarantee that all elements are next to each other, so when iterating on the result it will likely end up faster.

You should definitively run your own benchmarks. Not only between List and Array, but changing also the context (size, what kind of function is being mapped) can be very instructive.

patrick-nicodemus · April 22, 2024, 6:07pm

Thanks for the insight. Can you explain you mean by “GC barriers”?
I will definitely run my own test, I see what you mean about manipulating more of the variables to get a better understanding.

silene · April 23, 2024, 3:42am

In the case of Array.map, the target array preexists any of the values (f src.(i)) that are being computed to fill it. Therefore, the array might be old enough to have been moved to the major heap while some of its content is still fresh enough to live on the minor heap. As such, the runtime needs to perform some extra work so that the values living on the minor heap are not incorrectly freed during the next garbage collection. This costly extra work is the “GC barrier”.

In the case of List.map, the new list cell is created right after the value that fills it has been computed, which means that younger values point to older values, and not the opposite. As such, no values can be mistakenly freed, so no extra work is need.

patrick-nicodemus · April 23, 2024, 8:41pm

Interesting, thank you!

Jon_Harrop · May 13, 2024, 7:58pm

The answer will depend not only (strongly) on n but also on the element type and the function argument, particularly the allocation patterns of the function argument.

If you map (+) 1 over a collection of ints I’d expect the performance to be comparable except for an earlier jump to out-of-cache with List.map because it requires more memory.

If you map something that allocates a temporary over a collection of floats then I’d expect Array.map to be substantially faster.

Chet_Murthy · May 14, 2024, 1:45am

A few years ago, I had a conversation with Jean-Christophe Filliatre, about this very subject. He was pretty firm on the position that for almost all the typical uses we make of lists, arrays are faster and more compact. My memory was that he had benchmarks on nontrivial codebases, behind this position, but I could be misremembering.

Separately, in 1992-4 while implementing Coq V5.10, I found that to represent (M N1 N2 ... Nk) (the iterated application of M to N1, then … Nk), representing the args as an array was much faster than as a list, and this was on the entire corpus of Coq’s benchmarks at the time.

ETA: There are two cases where arrays really win big: (1) when data is read more than consed, e.g. in a theorem-prover where you recurse over data a lot, Array.map incurs fewer pointer-follows than List.map, hence much faster; (2) arrays are much smaller (one cell instead of three), so heap-size savings.

Topic		Replies	Views
Fastest implementation of stack-safe List.map, featuring Containers, Batteries, and Base Community	4	1395	June 22, 2017
Stdlib.List loop unrolling Learning	11	973	January 28, 2023
`float array` and `floatarray` have similar performance in my benchmark Learning	9	797	February 14, 2023
A new List.map that is both stack-safe and fast Community article , performance	34	12655	February 24, 2018
Performance difference between bigarray inside record and plain bigarray access Learning performance , beginner	8	751	August 16, 2021

Array.map vs List.map performance

Related topics