Array.map vs List.map performance

All other things being equal, would you expect Array.map or List.map to be faster, given that the arguments have the same number of elements n?

As far as I can tell, this should be equivalent to “Is it faster to allocate many small regions of memory or one large region of memory”. One also has to traverse the input list, which is bad for cache locality.

Maybe the question depends on the size of n.

I would guess that Array.map is faster but this is just intuition, I would like to know if someone has benchmarked this or has internal knowledge of how OCaml allocates memory that would point in one direction or the other.

It depends.

In isolation, Array.map is usually a bit slower, as it involves GC barriers. But the time of the function itself is often less important than the impact on the rest of the code.

For very small values of n, lists could end up much better because the optimiser can track them (this is mostly relevant with the native compiler with flambda on). For instance, List.map f [x] can be optimised into [ f x ], removing the initial allocation for [x], and if you chain list-manipulating functions you could end up not actually allocating any lists except the last one. But that’s very limited: you need manual annotations ([@unrolled]) if you want that to work with lists bigger than 1, even at -O3.

For very large values of n arrays will likely be better due to the reduced memory footprint. Additionally, they have better cache locality both because of the more compact representation and the guarantee that all elements are next to each other, so when iterating on the result it will likely end up faster.

You should definitively run your own benchmarks. Not only between List and Array, but changing also the context (size, what kind of function is being mapped) can be very instructive.

1 Like

Thanks for the insight. Can you explain you mean by “GC barriers”?
I will definitely run my own test, I see what you mean about manipulating more of the variables to get a better understanding.

In the case of Array.map, the target array preexists any of the values (f src.(i)) that are being computed to fill it. Therefore, the array might be old enough to have been moved to the major heap while some of its content is still fresh enough to live on the minor heap. As such, the runtime needs to perform some extra work so that the values living on the minor heap are not incorrectly freed during the next garbage collection. This costly extra work is the “GC barrier”.

In the case of List.map, the new list cell is created right after the value that fills it has been computed, which means that younger values point to older values, and not the opposite. As such, no values can be mistakenly freed, so no extra work is need.

2 Likes

Interesting, thank you!