Stdlib.List loop unrolling

The benefit of TRMC is that your function is tail recursive, so won’t blow-up the stack for large lists. Doing a tail-recursive map which retains the performances of the non-tail-recursive one is quite the challenge (the following thread has all the info). A very crude summary:

  • Usual List.map : fast calls (which can further be unrolled) but not tail recursive, you may blow up the stack for long lists.
  • Simple tail-recursive List.map by doing reverse at the end: the cost of reversing hurts quite a bit in practice and makes this non competitive for medium sized lists.
  • Unsafe tail-recursive List.map by using Obj.magic and casting a mutable list that is allocated and modified in place. This relies on assumptions on unsafe features, which may be invalidated if you change backend or runtime (flambda, multicore,…).
  • Clever safe tail-recursive List.map (as explained in the linked discussion). Very clever, needs to be hand-tuned (to recover the performances of non tail-rec) and the code is quite complex which becomes a maintenance problem, especially if you want several versions hand tuned for several architectures.

TRMC, to simplify, translates the simple natural List.map (and other similar functions) to the version that creates a mutable structure. But the translation is done by the compiler. There is still a maintenance burden (the TRMC transform itself within the compiler) but it’s more general so the cost is deemed worth it (since now you don’t have to maintain a gazillion of unsafe tail recursive functions doing Obj.magic all over the place for map, concat, etc…).

1 Like