I’m learning how to write C code that interfaces with the OCaml GC, and after reading about the CAMLparamN(...), CAMLlocalN(...), and CAMLreturn(...) macros, and the manual’s “simple” rules, I was left confused about what was and wasn’t safe in some less-common cases. In particular, neither the descriptions of these macros nor the manual’s examples make clear what you should do if you want to retain garbage-collected objects within nested functions, i.e. C functions called by other C functions, and never directly from OCaml. I would hope you could use the same macros, but there’s no example (that I’ve found in the manual) that demonstrates this.
Normally I try to answer these sorts of questions by getting a better understanding of the implementation, so I grepped for where these macros were defined, and had a good long look at memory.h. My understanding of how these macros work is now as follows:
- GC roots on the C stack are indicated by a separate stack (implemented as a linked list of frames) full of pointers to on-stack memory locations. The frames of this stack are directly allocated within the C stack, and a pointer to the most-recently-created frame is kept in the global variable caml_local_roots.
- Each of the CAMLlocalN macros adds a new frame to this stack, pointing to the newly declared variables as GC roots.
- Each of the CAMLparamN macros saves the existing caml_local_roots to the local variable caml__frame and, when N > 0, also pushes a frame that registers the parameters themselves as GC roots.
- The CAMLreturn macro restores caml_local_roots to caml__frame, so that the GC roots used by the function are forgotten.
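To make the list above concrete, here is a minimal, self-contained model of that mechanism. It is a sketch and not the real memory.h: every name starting with `model_` or `MODEL_` is made up for illustration, each frame holds a single table, and `model_value` is just a `long` standing in for OCaml's `value`. It shows a nested helper (`inner`) using the same save/push/restore pattern as its caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for OCaml's `value` type. */
typedef long model_value;

/* Simplified frame: one table of roots living on the C stack. */
struct model_frame {
  struct model_frame *next;  /* frame that was on top before this one */
  size_t nitems;             /* number of roots in the table */
  model_value *table;        /* address of the first root */
};

/* Plays the role of the global caml_local_roots. */
static struct model_frame *model_local_roots = NULL;

/* Like CAMLparam0: remember the top of the root stack at function entry. */
#define MODEL_PARAM0() \
  struct model_frame *model__frame = model_local_roots

/* Like CAMLlocal1: declare a variable and push a frame pointing at it. */
#define MODEL_LOCAL1(x) \
  model_value x = 0; \
  struct model_frame model__roots_##x; \
  model__roots_##x.next = model_local_roots; \
  model__roots_##x.nitems = 1; \
  model__roots_##x.table = &x; \
  model_local_roots = &model__roots_##x

/* Like CAMLreturn: compute the result, pop this function's frames, return. */
#define MODEL_RETURN(result) do { \
  model_value model__result = (result); \
  model_local_roots = model__frame; \
  return model__result; \
} while (0)

/* Counts the roots currently registered, i.e. what a GC would scan. */
size_t model_count_roots(void) {
  size_t n = 0;
  for (struct model_frame *f = model_local_roots; f != NULL; f = f->next)
    n += f->nitems;
  return n;
}

/* Nested helper: only ever called from C, yet it uses the same macros. */
model_value inner(void) {
  MODEL_PARAM0();
  MODEL_LOCAL1(tmp);
  /* Visible here: the caller's two roots plus tmp. */
  tmp = (model_value)model_count_roots();
  MODEL_RETURN(tmp);
}

model_value outer(void) {
  MODEL_PARAM0();
  MODEL_LOCAL1(a);
  MODEL_LOCAL1(b);
  a = inner();                           /* root count seen inside inner */
  b = (model_value)model_count_roots();  /* inner's frame is gone again */
  MODEL_RETURN(a * 10 + b);
}
```

Calling `outer()` encodes both root counts in its result, and once it returns the model root stack is empty again, which is the property the restore step in the real CAMLreturn is there to guarantee.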
With this implementation understood, my original question was answered, but I noticed another set of macros later in the file, specifically the Begin_rootsN and End_roots macros. Though they were mentioned nowhere in the manual, they seemed much more straightforward (and a bit more flexible). And I noticed the comment above them:
NOTE: [Begin_roots] and [End_roots] are superseded by [CAMLparam]*, [CAMLxparam]*, [CAMLlocal]*, [CAMLreturn].
I can think of a few reasons why the newer macros might have been introduced in their place (even if I find these more primitive ones more obvious), but I would still like to use Begin_roots and End_roots in my own code. So what does “superseded” mean, in this context? Are they deprecated, and in the header only for compatibility reasons? Were they incompatible with some new variation on the GC that’s looking to land in OCaml? Or is it perfectly safe to use them, even if no-one apparently thought I would want to?
And a minor side question…
The layout of these stack frames is rather curious. I would have expected them to be alloca'd blocks of stack memory with a header specifying the number of roots each one contains, but they seem to instead be fixed-size blocks of pointers to up to 5 “tables”, where each table can hold more than one root (but where a single root count, nitems, applies to every table the frame references).
What’s the history behind this layout choice? Are these extra “features” used inside the bytecode interpreter for some purpose?
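For reference, the layout described above can be modelled as follows. This is a sketch under assumptions, not a copy of the header: the `model_` names are hypothetical, `long` stands in for `value`, and the field names follow what I believe memory.h uses (next, ntables, nitems, tables). If I read the macros correctly, CAMLlocal3 fills a frame with ntables = 3 and nitems = 1, while CAMLlocalN(x, size) uses ntables = 1 and nitems = size, so the same fixed-size frame covers both several scalars and one array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for OCaml's `value` type. */
typedef long model_value;

/* Fixed-size frame: up to 5 tables, all sharing one per-table item count. */
struct model_roots_block {
  struct model_roots_block *next;
  size_t ntables;          /* how many entries of tables[] are in use */
  size_t nitems;           /* roots per table, same for every table */
  model_value *tables[5];
};

/* Walk frames, then tables, then items, calling visit on every root.
   Returns the number of roots visited. */
size_t model_scan(struct model_roots_block *top,
                  void (*visit)(model_value *)) {
  size_t visited = 0;
  for (struct model_roots_block *f = top; f != NULL; f = f->next)
    for (size_t i = 0; i < f->ntables; i++)
      for (size_t j = 0; j < f->nitems; j++) {
        visit(&f->tables[i][j]);
        visited++;
      }
  return visited;
}

/* Example visitor: bump each root so the demo can check it was reached. */
static void model_mark(model_value *root) { *root += 1; }

size_t model_layout_demo(void) {
  model_value a = 0, b = 0, c = 0;
  model_value arr[4] = {0, 0, 0, 0};

  /* Three scalars: 3 tables of 1 item each. */
  struct model_roots_block scalars = { NULL, 3, 1, { &a, &b, &c } };
  /* One array: 1 table of 4 items, pushed on top of the scalars frame. */
  struct model_roots_block array = { &scalars, 1, 4, { arr } };

  size_t visited = model_scan(&array, model_mark);
  assert(a == 1 && b == 1 && c == 1);
  assert(arr[0] == 1 && arr[1] == 1 && arr[2] == 1 && arr[3] == 1);
  return visited;
}
```

In this model the two frames together expose seven roots, and no frame needs a variable-length allocation, which may be part of the appeal of the fixed-size design.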