However, I saw no evidence of automatic unrolling by the compiler in List.map so far, which makes me suspect that suppressing benefits from automatic unrolling followed by inlining is not yet an issue with current OCaml.
Yeah, as we discussed previously, this comes down to the size of the computation kernel. If the kernel is small, then the gain from inlining it would be negligible with respect to the list processing overhead. And since you're actually measuring this overhead, it is right to have a small kernel. However, what I'm saying is that in real life the kernels tend to be bigger, so the performance impact of the list processing is smaller. So these are competing goals: your implementation reduces the list overhead, while preventing flambda from reducing the kernel overhead.
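To make the trade-off concrete, here is a rough sketch of the kind of comparison I have in mind. Everything in it is made up for illustration: `map_unrolled` is a hypothetical hand-unrolled map standing in for your implementation (not your actual code), and `small_kernel` and `big_kernel` are arbitrary stand-ins for a trivial and a heavier computation. Whether flambda inlines either kernel depends on the compiler version and flags, so treat any timings as indicative only.

```ocaml
(* Hypothetical hand-unrolled map: handles two elements per recursive call.
   Not the implementation from this thread, just an illustration. *)
let rec map_unrolled f = function
  | [] -> []
  | [x] -> [f x]
  | x :: y :: tl ->
    let a = f x in
    let b = f y in
    a :: b :: map_unrolled f tl

(* Small kernel: the list traversal dominates the cost. *)
let small_kernel x = x + 1

(* Bigger kernel (arbitrary arithmetic): the kernel dominates the cost. *)
let big_kernel x =
  let acc = ref x in
  for i = 1 to 100 do
    acc := (!acc * 31 + i) land 0xFFFF
  done;
  !acc

(* Crude CPU-time measurement; a real benchmark would repeat the runs. *)
let time label f =
  let t0 = Sys.time () in
  let r = f () in
  Printf.printf "%-16s %.4fs\n" label (Sys.time () -. t0);
  r

let () =
  let xs = List.init 50_000 (fun i -> i) in
  let a = time "List.map small" (fun () -> List.map small_kernel xs) in
  let b = time "unrolled small" (fun () -> map_unrolled small_kernel xs) in
  let c = time "List.map big" (fun () -> List.map big_kernel xs) in
  let d = time "unrolled big" (fun () -> map_unrolled big_kernel xs) in
  (* Both maps must agree on the results, whatever the timings are. *)
  assert (a = b);
  assert (c = d)
```

With the small kernel, any difference between the two maps is mostly list overhead. With the big kernel, I would expect that difference to shrink relative to the total, and whether the kernel gets inlined to matter more.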