There have been various discussions on the subject, of course.
But let’s start with the definitions: what does it mean for a value to be computed at compile time?
Remember that OCaml is mostly a functional language. In particular, the intermediate representations used for optimisations are close to an extension of the lambda-calculus.
So what does it mean for a lambda-term to be evaluated at compile time? Well, the most obvious answer is to beta-reduce it until we get a value (i.e. not an application). But that’s not always possible: if you get an application of the form `x y`, where `x` is a variable on which you have no information in this scope, you can’t reduce it further. In real OCaml programs, this can happen either if you try to do compile-time computations inside functions (`let foo f = let y = (f[@comp]) 0 in y`) or if you access compilation units for which the `.cmx` file is absent (in your `List.sort` example, if you don’t have access to `list.cmx` you can’t reduce it further). Values that are the result of a call to a C function (or `Sys.opaque_identity`) also broadly fall into this case. This is not that big a problem: if the compiler can’t reduce an application, it can emit a warning saying that it couldn’t evaluate the expression at compile time and move on.
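To make the "beta-reduce until you get a value, or get stuck" idea concrete, here is a toy sketch (purely illustrative, not compiler code) of such a reducer for a minimal lambda-calculus. The substitution assumes all bound names are distinct, so it skips capture-avoidance:

```ocaml
(* A toy illustration: beta-reduce a term until we reach a value,
   or get stuck on an application headed by a free variable. *)
type term =
  | Var of string
  | Lam of string * term
  | App of term * term

(* Naive substitution; we assume no shadowing in the examples below,
   so capture-avoiding renaming is elided for brevity. *)
let rec subst x v = function
  | Var y -> if x = y then v else Var y
  | Lam (y, t) -> if x = y then Lam (y, t) else Lam (y, subst x v t)
  | App (t1, t2) -> App (subst x v t1, subst x v t2)

(* Returns [Ok value] on success, or [Error stuck_term] when the head
   of an application is a variable we know nothing about. *)
let rec reduce = function
  | Var _ as t -> Ok t
  | Lam _ as t -> Ok t
  | App (t1, t2) ->
    (match reduce t1 with
     | Ok (Lam (x, body)) ->
       (match reduce t2 with
        | Ok v -> reduce (subst x v body)
        | Error _ as e -> e)
     | Ok head -> Error (App (head, t2))  (* stuck: unknown head *)
     | Error _ as e -> e)

let () =
  (* (fun x -> x) y reduces fully... *)
  (match reduce (App (Lam ("x", Var "x"), Var "y")) with
   | Ok (Var "y") -> print_endline "reduced to y"
   | _ -> print_endline "unexpected");
  (* ...but [f zero] is stuck: [f] is free, we can't go further. *)
  match reduce (App (Var "f", Var "zero")) with
  | Error _ -> print_endline "stuck on application of f"
  | Ok _ -> print_endline "unexpected"
```

The `Error` case is exactly the situation described above: the reducer has to give up (and a compiler would warn) rather than invent a result.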
In the Flambda world, this should be doable through inlining. If we inline all function calls that occur under a given `[@comp]` annotation, we could produce something that looks (after simplification) reasonably like what you would expect. In fact, if I compile `let lst = List.sort Int.compare [1;2;7;4;5;6;0;3]` with `ocamlopt -O3 -inline-call-cost 100 -inline-max-unroll 10`, I get something equivalent to `let lst = [0; 1; 2; 3; 4; 5; 6; 7]`. Unfortunately there is currently no way to set the inlining parameters for a specific part of the code only, and of course finding the exact parameters to pass is not exactly trivial. But it shows that if we wanted to add support for your `[@comp]` attribute, we could already go a long way with minimal additional effort.
However, if we try the same approach with `Array.sort`, we quickly hit a wall: loops and operations on mutable structures are never simplified, so even if you inline all function calls you’ll never get a single result.
Simplifying loops is not completely impossible. As a proof of concept, we could rewrite `while _test do _body done` to `let rec loop () = if _test then (_body; loop ()) in loop ()` and reuse the unrolling mechanism. But loops are all about performing side effects, and that’s where it gets complicated. Tracking the contents of values with mutable fields is very tricky: if you can’t be 100% sure that you know all the places where the value gets modified, you can’t do any meaningful simplifications. So the optimiser needs a precise escape analysis (to track where each mutable value might end up), which tends to be expensive. After that, you need to track the effects of evaluating the expressions during the traversal. In practice, this means that if your symbolic evaluator keeps its environment as a map from variables to symbolic values, then instead of simply passing the environment as an argument during the traversal of sub-expressions, you also have to return it and be careful to always use the updated version. This also forces you to choose an evaluation order for all expressions that have multiple sub-expressions.
All of that is a lot of work, but it can be useful for other reasons too, so most of the required pieces of code have already been written in one form or another on our experimental Flambda 2 branch. (The escape analysis part is very incomplete though; for now we only have the code from the main compiler that translates non-escaping references to mutable variables. For the purpose of compile-time evaluation of mutable structures, we would need to generalise it to other data types and integrate it into the simplifier.)
But there is one last issue: the `array` type. It’s a built-in type, but some of the primitives manipulating arrays are actually implemented in C, in the runtime. So to actually simplify code using arrays, we need to hardcode in our simplifier (or symbolic evaluator) the semantics of all meaningful C functions of the runtime. Not impossible, but a lot of code to write and maintain.
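Purely to illustrate the shape of that work (this is not how Flambda is structured, and the table below is invented): "hardcoding the semantics" amounts to keeping an OCaml-side model for each known runtime primitive, and giving up on anything unknown:

```ocaml
(* Hypothetical sketch: an OCaml-side model of a few runtime C
   primitives, so a symbolic evaluator can execute them at compile
   time. Unknown primitives block evaluation (and trigger a warning). *)
type value =
  | Int of int
  | Array of value array

(* Map a runtime primitive name to a model, if we have one.
   The names mirror real runtime functions, but the models are ours. *)
let model_of_primitive = function
  | "caml_make_vect" ->
    Some (function
      | [ Int n; init ] -> Some (Array (Stdlib.Array.make n init))
      | _ -> None)
  | "caml_array_unsafe_get" ->
    Some (function
      | [ Array a; Int i ] -> Some a.(i)
      | _ -> None)
  | _ -> None  (* unknown C function: give up *)

let () =
  match model_of_primitive "caml_make_vect" with
  | Some model ->
    (match model [ Int 3; Int 7 ] with
     | Some (Array a) ->
       Printf.printf "built array of length %d\n" (Stdlib.Array.length a)
     | _ -> print_endline "could not evaluate")
  | None -> print_endline "unknown primitive"
```

Every array primitive (and anything else the runtime implements in C) would need such a model, kept in sync with the runtime's actual behaviour, which is where the maintenance burden comes from.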
To sum up, the feature you propose makes sense from the user’s point of view, and a lot of the code needed to make it work is already there. But making it work in all cases is still a long way off, and we’re not very keen on introducing a feature that only works partially, without a good specification of which cases are handled and which are not (cough `[@tailcall]` cough). Feel free to post a feature request on the main tracker about it though; maybe someone will find the time and motivation to work on it at some point.