I’m learning how to write C code that interfaces with the OCaml GC, and after reading about the CAMLparam(...), CAMLlocal(...), and CAMLreturn(...) macros and the manual’s “simple” rules, I was left confused about what is and isn’t safe in some less common cases. In particular, neither the descriptions of these macros nor the manual’s examples make clear what you should do if you want to hold on to garbage-collected objects within nested functions, i.e. C functions that are called by other C functions and never directly from OCaml. I would hope you could use the same macros, but no example that I’ve found in the manual demonstrates this.
Normally I try to answer these sorts of questions by getting a better understanding of the implementation, so I grepped for where these macros are defined and had a good long look at memory.h. My understanding of how these macros work is now as follows:
GC roots on the C stack are tracked by a separate stack (implemented as a linked list of frames) of pointers to on-stack memory locations. The frames of this stack are themselves allocated on the C stack, and a pointer to the most recently created frame is kept in the global variable caml_local_roots.
Each of the CAMLlocalN macros adds a new frame to this stack, pointing to the newly declared variables as GC roots.
Each of the CAMLparamN macros saves the existing caml_local_roots to the local variable caml__frame, so that CAMLreturn can restore it on exit and the GC roots used by the function are forgotten.
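To check my reading against the nested-function case, here is a small self-contained model of the mechanism described above. It does not use the real headers: PARAM1, LOCAL1, RETURN, roots_block, local_roots and the two functions are hypothetical stand-ins I made up for illustration, and value is simplified to a long. It is a sketch of my understanding, not the runtime's actual code.

```c
#include <assert.h>
#include <stddef.h>

typedef long value;   /* stand-in for the runtime's value type */

/* Hypothetical model of one frame of the root stack. */
struct roots_block {
  struct roots_block *next;   /* frame registered before this one */
  int ntables;                /* entries of tables[] in use */
  int nitems;                 /* roots per table */
  value *tables[5];           /* addresses of on-stack roots */
};

static struct roots_block *local_roots = NULL;   /* models caml_local_roots */
static int deepest_seen = 0;                     /* for the demo only */

/* Modelled on CAMLparam1: save the caller's frame, then register x. */
#define PARAM1(x) \
  struct roots_block *saved_frame = local_roots; \
  struct roots_block param_block = { local_roots, 1, 1, { &(x) } }; \
  local_roots = &param_block

/* Modelled on CAMLlocal1: declare x and push a frame pointing at it. */
#define LOCAL1(x) \
  value x = 0; \
  struct roots_block local_block_##x = { local_roots, 1, 1, { &(x) } }; \
  local_roots = &local_block_##x

/* Modelled on CAMLreturnT: pop this function's frames, then return. */
#define RETURN(type, result) do { \
  type return_value = (result); \
  local_roots = saved_frame; \
  return return_value; \
} while (0)

/* Records the largest number of frames seen on the root stack. */
static void record_depth(void) {
  int n = 0;
  struct roots_block *b;
  for (b = local_roots; b != NULL; b = b->next) n++;
  if (n > deepest_seen) deepest_seen = n;
}

/* A nested helper: only ever called from C, but using the same pattern. */
static value inner(value v) {
  PARAM1(v);
  LOCAL1(doubled);
  doubled = 2 * v;
  record_depth();   /* sees outer's two frames plus inner's two */
  RETURN(value, doubled);
}

static value outer(value v) {
  PARAM1(v);
  LOCAL1(result);
  struct roots_block *before = local_roots;
  result = inner(v) + 1;
  assert(local_roots == before);   /* inner popped exactly its own frames */
  RETURN(value, result);
}
```

In this model, calling outer(20) leaves the root stack empty again afterwards, and the nested call simply stacks its frames on top of the caller's. Nothing in the mechanism seems to depend on whether the caller is OCaml or C.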
Understanding this implementation answered my original question, but I noticed another set of macros later in the file, specifically the Begin_roots and End_roots macros. Though they are mentioned nowhere in the manual, they seemed much more straightforward (and a bit more flexible). I also noticed the comment above them:
NOTE: [Begin_roots] and [End_roots] are superseded by [CAMLparam]*, [CAMLxparam]*, [CAMLlocal]*, [CAMLreturn].
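To illustrate what I mean by “more straightforward”: as I read the header, the Begin_roots macros open a brace scope and push one frame, and End_roots pops that frame and closes the scope. Below is a simplified, self-contained model of that shape. BEGIN_ROOTS2, END_ROOTS, roots_block, local_roots and the helper functions are hypothetical names of mine, and value is simplified to a long, so treat it as a sketch of my understanding rather than the real definitions.

```c
#include <assert.h>
#include <stddef.h>

typedef long value;   /* stand-in for the runtime's value type */

/* Hypothetical model of one frame of the root stack. */
struct roots_block {
  struct roots_block *next;
  int ntables;
  int nitems;
  value *tables[5];
};

static struct roots_block *local_roots = NULL;   /* models caml_local_roots */

/* Modelled on Begin_roots2: open a scope and push one frame
   holding two tables of one root each. */
#define BEGIN_ROOTS2(r0, r1) { \
  struct roots_block block; \
  block.next = local_roots; \
  local_roots = &block; \
  block.nitems = 1; \
  block.ntables = 2; \
  block.tables[0] = &(r0); \
  block.tables[1] = &(r1);

/* Modelled on End_roots: pop the frame and close the scope. */
#define END_ROOTS() \
  assert(local_roots == &block);   /* frames must be popped in order */ \
  local_roots = block.next; }

/* Counts the frames currently on the root stack. */
static int frames_registered(void) {
  int n = 0;
  struct roots_block *b;
  for (b = local_roots; b != NULL; b = b->next) n++;
  return n;
}

/* The roots are registered only for the bracketed region. */
static int scoped_demo(value a, value b) {
  int before, during, after;
  before = frames_registered();
  BEGIN_ROOTS2(a, b)
    during = frames_registered();
  END_ROOTS();
  after = frames_registered();
  return before * 100 + during * 10 + after;
}
```

The flexibility is that the registration can cover just part of a function. The obvious hazard, at least in this model, is that a return or jump out of the bracketed region would skip END_ROOTS and leave local_roots pointing at a dead stack frame.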
I can think of a few reasons why the newer macros might have been introduced in their place (even if I find these more primitive ones more obvious), but I would still like to use Begin_roots and End_roots in my own code. So what does “superseded” mean in this context? Are they deprecated, and in the header only for compatibility reasons? Were they incompatible with some new variation on the GC that’s looking to land in OCaml? Or is it perfectly safe to use them, even if no one apparently thought I would want to?
And a minor side question…
The layout of these stack frames is rather curious. I would have expected them to be alloca'd blocks of stack memory with a header specifying the number of roots each one contains, but they seem instead to be fixed-size blocks of pointers to up to 5 “tables”, where each table can hold more than one root (but where a single root count is specified for all of the tables a frame references).
What’s the history behind this layout choice? Are these extra “features” used inside the bytecode interpreter for some purpose?
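To make the layout question concrete, here is a small model of the frame as I read it in memory.h, together with the triple loop I assume the runtime uses to visit the roots (frames, then tables, then items). The names roots_block, visit_roots, bump and layout_demo are mine, not the runtime's, the field names follow what I remember from the header, and value is simplified to a long.

```c
#include <assert.h>
#include <stddef.h>

typedef long value;   /* stand-in for the runtime's value type */

/* Hypothetical model of one frame of the root stack. */
struct roots_block {
  struct roots_block *next;   /* frame registered before this one */
  int ntables;                /* entries of tables[] in use (at most 5) */
  int nitems;                 /* consecutive roots in each table */
  value *tables[5];
};

/* Visits every root: frames, then tables, then items within a table.
   Returns how many roots were visited. */
static int visit_roots(struct roots_block *frames, void (*f)(value *)) {
  int visited = 0;
  int i, j;
  struct roots_block *b;
  for (b = frames; b != NULL; b = b->next) {
    assert(b->ntables <= 5);
    for (i = 0; i < b->ntables; i++) {
      for (j = 0; j < b->nitems; j++) {
        f(&b->tables[i][j]);
        visited++;
      }
    }
  }
  return visited;
}

/* Stands in for the GC updating a root in place. */
static void bump(value *root) { *root += 1; }

static int layout_demo(void) {
  value x = 10, y = 20;       /* two separate variables */
  value arr[3] = {1, 2, 3};   /* one array of roots */
  /* Two tables of one item each: the shape used for separate variables. */
  struct roots_block singles = { NULL, 2, 1, { &x, &y } };
  /* One table of three items: the shape an array of roots would need. */
  struct roots_block array = { &singles, 1, 3, { arr } };
  int n = visit_roots(&array, bump);
  return n * 1000 + (int)(x + y + arr[0] + arr[1] + arr[2]);
}
```

In this model the same struct covers both “several separate variables” (many tables, one item each) and “one array of values” (one table, many items), which I suspect is the reason for the two counts, though I would like to have that confirmed.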