Has there ever been any talk of reimplementing the OCaml standard library recursive (tail or otherwise) functions with imperative loops? It wouldn’t change their interfaces or runtime behaviour (except performance) and it would help on platforms which don’t have good tail call elimination.
Base (and thereby Core_kernel and Core) have tail-recursive implementations of the list functions. They’re not done with imperative loops, but instead with recursive functions using accumulators. You can read more about the technique here:
I believe that Batteries also avoids normal recursion e.g for all of the fold functions.
an OCaml backend that doesn’t perform proper tail call elimination is simply broken, and we should not make stdlibs’ code uglier for no good reason.
Are you talking about “tail recursion modulo constructor”?
Thanks @Yaron_Minsky, that was a good read.
I was more thinking about BuckleScript, which produces JS artifacts which can be deployed to browsers. Unfortunately browsers don’t uniformly implement TCE, see e.g. https://bugs.chromium.org/p/chromium/issues/detail?id=753705
So I understand the argument that stdlib code should not be ugly (to the extent possible), maybe it would be a good idea as @mars0i suggested to use an overlay module like from Batteries on platforms which don’t support recursion well.
On a serious note, I concur with @c-cube’s comment: a specific back-end, should not determine the use of tail-recursion in standard libraries; especially a browser of all targets I add.
Note that standard solution to this problem is the usage of trampolines. However, generated code will be slightly more complicated (and Bucklescript tries to generate human-readable JS).
As an aside, it may be useful to transform a subset of recursive functions into loops where possible for the sake of automatic vectorization.
Oh that’s cool, I’ll try out a few variations and see how far I can take it.
If you refer to optimization for parallel computing as explained here with automatic vectorization with loops. Use monoids & stuff like that you will have easy parallel computing thanks to not using loops. Personally I use Functory library for that. Parallel computing is easier without loops.
Parallel computing isn’t the same as vectorization. Vectorization is closer to using a GPU, and you need loop structures of some kind to do the analysis properly. Parallel computation and vectorization are orthogonal. However, on most code, vectorization has minimal impact, so it’s admittedly not a huge deal.