Transmissibility of protection of parameters of type value from garbage collection

The OCaml manual states in Chapter 20, "Interfacing C with OCaml", that when writing to OCaml’s FFI "A function that has parameters or local variables of type value must begin with a call to one of the CAMLparam macros and return with CAMLreturn, CAMLreturn0, or CAMLreturnT".

Is it correct to treat this, so far as it concerns function parameters of type value, as applying only to those which are not already protected? In other words, given an exported procedure ocaml_func returning void, and another procedure helper_func with internal linkage returning void, is the code below OK? That is, do helper_func's arguments benefit from the CAMLparam1 and CAMLlocal1 calls in ocaml_func, so that helper_func does not need to repeat them to protect against the values being moved on heap compaction?

static void helper_func(value v1, value v2)
{
  [ ... do something which might trigger a collection ... ] ;
  [ ... do something with v1 and v2 ... ] ;
  return;
}


void ocaml_func(value v1)
{
  CAMLparam1(v1);
  CAMLlocal1(v2);
  v2 = [ ... ] ;  /* CAMLlocal1 has already declared v2 */
  helper_func(v1, v2);
  CAMLreturn0;
}

Yes, that should be fine.

Cheers,
Nicolas

Section 20.6, "A Complete Example", gives similar treatment to the static function alloc_window. So the manual assumes this is right.

My understanding is that your code is not correct.
The CAML* macros have two purposes:

  • Ensuring that the values are still considered alive by the GC
  • Keeping the C local variables and parameters pointing to the correct location if values get moved during GC

For the first purpose, it is enough to register the values once, but if you use v1 and v2 in helper_func after a GC then without the CAMLparam macros you can’t be sure that they’re still valid.

I think the alloc_window example should be patched to use the CAMLlocal macro. It does actually work, because the only place where a GC can happen is the call to caml_alloc_custom, and at that point v is not yet defined; but I don’t think that the manual should encourage this kind of reasoning.
You can actually look at this section: after the first alloc_list_int example there is a short paragraph explaining this.
As far as I know every rule for interfacing with the GC applies to static functions as well as non-static ones.

Interesting. I don’t think the alloc_window example fits my case. That function takes a pointer to the C heap (the curses WINDOW* argument) and wraps it in a custom block, which as I understand it is the recommended way now of dealing with value types which are pointers into the C heap (the purpose being to suppress the collector accidentally following the pointer when scanning for live objects). Nothing is done in that function after the call to caml_alloc_custom which could trigger a collection - the Window_val macro just twiddles bits - so the use of CAMLlocal in the alloc_window function is unnecessary.

My case is somewhat different. In my case helper_func takes two pointers, v1 and v2, which point to live objects in the OCaml (garbage collected) heap. Because helper_func does things which could trigger a collection before those pointers are used, “cover” by application of CAMLparam and/or CAMLlocal is unquestionably needed. The question is whether the application of those macros in ocaml_func provides that cover, so as to prevent the objects being moved on possible heap compaction while helper_func is executing.

So my question is whether the protection provided by the CAMLparam and CAMLlocal macros remains in effect for the period during which helper_func’s stack frame becomes the innermost stack frame. I assumed the answer was yes, but this does considerably affect some code I am writing. If it does remain in effect then I can implement the equivalent of helper_func in that code as a variadic template function. If not I have to write a version of the function for every possible number of arguments so I can apply the CAMLparam macros to them. (As an aside, that also causes other interesting issues, concerning the extent to which I can rely on the template function remaining inlined rather than being unhelpfully size-optimized into actual function calls having their own stack frame.)

So it is a practical issue for me.

I’m not 100% sure, but I think your code may have problems. Renaming the helper’s parameters to v3 and v4 for clarity:

static void helper_func(value v3, value v4)
{
  [ ... do something which might trigger a collection ... ] ;
  [ ... do something with v3 and v4 ... ] ;
  return;
}


void ocaml_func(value v1)
{
  CAMLparam1(v1);
  CAMLlocal1(v2);
  v2 = [ ... ] ;  /* CAMLlocal1 has already declared v2 */
  helper_func(v1, v2);
  CAMLreturn0;
}

When a GC is triggered, it may have to move values around in memory. It knows to update v1 and v2 (through their addresses &v1 and &v2) to hold the new locations of the OCaml values, because CAMLparam1 and CAMLlocal1 registered those addresses as “roots” in a global list called caml_local_roots. It does not, however, know that it needs to update v3 and v4, because their addresses were never registered. You pass them by value to helper_func, so they are temporary copies living at different addresses. If you passed value* pointers to helper_func instead, it might work.
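To make that concrete, here is a minimal toy model in plain C. It is not the OCaml runtime: toy_value, toy_alloc, toy_register, toy_collect and the helper and demo functions are all hypothetical names invented for this sketch, and the "collector" simply copies rooted cells into a second array and poisons the old one with -1. It only illustrates the mechanism: a registered variable is rewritten through its address, while a by-value copy of it is left pointing at the old location.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of this is the OCaml runtime. */
typedef long *toy_value;             /* a "value" is a pointer to a heap cell */

static long from_space[4];
static long to_space[4];
static long *heap = from_space;      /* semispace currently in use */
static size_t heap_used = 0;

#define MAX_ROOTS 8
static toy_value *roots[MAX_ROOTS];  /* addresses of registered variables */
static size_t n_roots = 0;

static toy_value toy_alloc(long payload)
{
  assert(heap_used < 4);
  heap[heap_used] = payload;
  return &heap[heap_used++];
}

static void toy_register(toy_value *slot)
{
  assert(n_roots < MAX_ROOTS);
  roots[n_roots++] = slot;
}

/* Copy every rooted cell to the other semispace, rewrite the registered
   variables in place, then poison the old space so stale reads show up. */
static void toy_collect(void)
{
  long *old = heap;
  long *fresh = (heap == from_space) ? to_space : from_space;
  size_t used = 0;
  for (size_t i = 0; i < n_roots; i++) {
    fresh[used] = **roots[i];
    *roots[i] = &fresh[used++];
  }
  for (size_t i = 0; i < 4; i++)
    old[i] = -1;
  heap = fresh;
  heap_used = used;
}

static void toy_reset(void)
{
  heap = from_space;
  heap_used = 0;
  n_roots = 0;
}

/* By-value parameter: a private, unregistered copy of the caller's pointer. */
static long helper_by_value(toy_value v)
{
  toy_collect();
  return *v;     /* still points into the poisoned old space */
}

/* By-address parameter: re-reads the caller's registered variable. */
static long helper_by_address(toy_value *v)
{
  toy_collect();
  return **v;    /* follows the pointer the collector rewrote */
}

long demo_by_value(void)
{
  toy_reset();
  toy_value v = toy_alloc(42);
  toy_register(&v);
  return helper_by_value(v);
}

long demo_by_address(void)
{
  toy_reset();
  toy_value v = toy_alloc(42);
  toy_register(&v);
  return helper_by_address(&v);
}
```

In this toy, demo_by_value() reads the poisoned cell while demo_by_address() still sees 42. With the real runtime the stale read would not be a tidy -1; it could be anything, which is why the caml_gc_full_major trick below is useful for flushing such bugs out.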

A way I sometimes use to test things in these tricky situations is:

extern "C" value caml_gc_full_major(value unit);
/* this is the primitive called by Gc.full_major () */

static void helper_func(value v3, value v4) {
  /* force gc */
  caml_gc_full_major(Val_unit);
  [ ... your usual code ... ]
}

You should not need to change any build rules to test this way, because even though caml_gc_full_major is not public, it is part of the OCaml runtime being linked into your executable already.

Hope this helps.

Thank you. I think you have more extensively identified the issue that was troubling me.

From what I can see, OCaml keeps its local roots in stack objects of type struct caml__roots_block, with the one for the innermost registered stack frame pointed to by a field of a global Caml_state object. These roots are accessible through the Caml_state_field(local_roots) macro. When a CAMLparam macro is invoked, a local reference to the previous block is saved in a caml__frame pointer, and the local_roots field is pointed at a new block which has the previous block as its ‘next’ member and holds references to the new parameters in its ‘tables’ member. When CAMLreturn is invoked, the reference to the former local roots block is restored.
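As a rough sketch of that push/pop structure (my own simplification, not the runtime's code): toy_roots_block below is a cut-down stand-in for caml__roots_block with a single count field, toy_local_roots plays the role of the local_roots field, and toy_relocate stands in for whatever the collector does when it moves things. All of these names are hypothetical. The point is that the outer frame's block stays on the chain while the inner function runs, and that the registered variables are rewritten through the stored addresses.

```c
#include <assert.h>
#include <stddef.h>

typedef long toy_value;   /* stand-in for the runtime's value type */

/* Cut-down stand-in for caml__roots_block (hypothetical layout). */
struct toy_roots_block {
  struct toy_roots_block *next;   /* block of the enclosing frame */
  int ntables;                    /* number of addresses stored below */
  toy_value *tables[5];           /* addresses of the protected variables */
};

/* Plays the role of the local_roots field of the global state. */
static struct toy_roots_block *toy_local_roots = NULL;

/* Count every variable reachable from the innermost block outwards. */
int toy_count_roots(void)
{
  int n = 0;
  for (struct toy_roots_block *b = toy_local_roots; b != NULL; b = b->next)
    n += b->ntables;
  return n;
}

/* Stand-in for a moving collection: rewrite each registered variable. */
void toy_relocate(long delta)
{
  for (struct toy_roots_block *b = toy_local_roots; b != NULL; b = b->next) {
    assert(b->ntables <= 5);
    for (int i = 0; i < b->ntables; i++)
      *b->tables[i] += delta;
  }
}

static int toy_inner(void)
{
  struct toy_roots_block *saved = toy_local_roots;    /* like caml__frame */
  toy_value local = 7;
  struct toy_roots_block block = { saved, 1, { &local } };
  toy_local_roots = &block;                           /* push this frame */
  int seen = toy_count_roots();   /* the outer frame's roots are still listed */
  toy_relocate(100);              /* rewrites local and the outer p and q */
  toy_local_roots = saved;                            /* pop, like CAMLreturn */
  return seen;
}

int toy_outer(long *p_after)
{
  struct toy_roots_block *saved = toy_local_roots;
  toy_value p = 1, q = 2;
  struct toy_roots_block block = { saved, 2, { &p, &q } };
  toy_local_roots = &block;
  int seen = toy_inner();
  *p_after = p;                   /* p was changed behind the caller's back */
  toy_local_roots = saved;
  return seen;
}
```

Calling toy_outer(&x) reports three registered variables while toy_inner is running, and x ends up as 101 rather than 1. So registration by the caller does stay in effect during the callee, but only the variables whose addresses are in the tables get rewritten.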

I suppose what it comes down to is whether these macros temporarily prohibit the objects to which the protected pointers refer from being moved on heap compaction, in which case I am OK, or whether they mutate the addresses held by the pointers to enable the pointers to follow any move, in which case I am not unless as you say helper_func has parameters of type value* instead of value. It looks as if it may be the second case which applies given that the ‘tables’ field stores by value* and not value. But to be honest I am somewhat lost in macrodom.

It is the second case, indeed.

I think we have bottomed this one out and thank you for your help. I think having helper_func take value* arguments is the best choice for my use case, although it does require any argument whose address is taken to be an lvalue, with the result that immediate (unboxed) values not requiring protection can no longer be passed as unnamed temporaries. Hey ho, in my particular use case that is probably better than making helper_func non-variadic and writing the relevant versions of it out by hand.

Edit: I think it might be better to make the arguments of type volatile value* to prevent adverse optimizations, although I notice that the tables field of the caml__roots_block does not do so.

As many others have pointed out, it is indeed not safe to use v1 and v2 in helper_func when helper_func can trigger a collection. In my mind I accidentally deleted the “might trigger a collection” comment, probably because I was just looking at some code that did what is being discussed (but in a function that does not trigger the GC, so in a case where it is safe!).

Cheers,
Nicolas