Calls to C functions can be made cheaper by the native-code compiler if annotated with [@@noalloc].
One of the conditions is that the C function doesn’t allocate (the other being that it doesn’t raise any exception).
My first question is: how can I know that a function does not allocate?
Does it simply mean that none of the macros listed in the “Allocating blocks” section of the manual should be used by this function? If I stick to the so-called “Simple interface”, it would mean no use of:
Atom
caml_alloc*
caml_copy_*
My second question (maybe related?): why does the following example from the manual not respect the simple rules of the “Living in harmony with the garbage collector” section?
CAMLprim value foo_byte(value a, value b)
{
return caml_copy_double(foo(Double_val(a), Double_val(b)));
}
I would have expected to read the following instead:
CAMLprim value foo_byte(value a, value b){
CAMLparam2(a,b);
CAMLreturn(caml_copy_double(foo(Double_val(a),Double_val(b))));
}
Well, in general, it is hard to tell, so if you are not sure, don’t annotate. For example, if a function operates only on unboxed values and doesn’t call any function of the OCaml runtime, then you can rest assured that it doesn’t allocate.
Also, if you can’t tell whether a function allocates or not, it likely means that the function is big. For such functions, adding the noalloc annotation is futile, as it saves only a couple of instructions, which is negligible compared to the overall size of the function. So leave this optimization to small, getter/setter-style functions.
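To make the "small function on unboxed values" case concrete, here is a minimal sketch of a binding where the annotation is easy to justify. The name my_hypot is hypothetical. To keep the example runnable without a hand-written stub, it assumes the two hypot primitives that the OCaml runtime already exports (the same pair the standard library binds for hypot); a binding to your own C code would follow the same shape.

```ocaml
(* Hypothetical binding, for illustration only.
   - "caml_hypot_float" is the bytecode entry point: it receives boxed floats
     and boxes its result with caml_copy_double, so it does allocate.
   - "caml_hypot" is the native entry point: it takes and returns plain C
     doubles and never touches the OCaml heap, which is what makes
     [@@noalloc] defensible for native code. *)
external my_hypot : float -> float -> float
  = "caml_hypot_float" "caml_hypot"
  [@@unboxed] [@@noalloc]

let () =
  (* Both arguments and the result stay unboxed across the native call. *)
  Printf.printf "%.1f\n" (my_hypot 3.0 4.0)
```

The bytecode stub has the same shape as foo_byte in the question: it allocates, but only after the arguments have been read out as C doubles.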
There is nothing good in not following simple rules, especially in the manual, so I would treat this as a bug in the manual. However, the code is still safe, or correct, for some sense of the word “correct”. Although the values a and b are not registered as local roots, their payloads are read and passed to foo before anything can allocate, and since C passes doubles by value, they are copied. So a later garbage collection can no longer spoil them. And, of course, the result of foo is also a plain C double, so by the time caml_copy_double is invoked, its argument is totally independent of the input values a and b.
Of course, your version of the function is much better, as it does not require such deep reasoning from everyone who edits this function later and who may add some code before the call to foo that invalidates a and/or b.
There might be a significant difference in performance between the two, though. Since you know that the values do not need to be registered as roots, if the function is performance-critical, there is an argument for carefully commenting the usage rather than registering them anyway.
Thanks for your explanation. I do not find it to be such deep reasoning (at least in the case of this trivial function). If/when I find the time, I’ll post some more concrete examples, to get more of this kind of reasoning.
After some more concrete examples, I may understand the meaning of “C function doesn’t allocate”.