Pull requests for the compiler suggest that Int_val should not be used inside a blocking section (between caml_enter_blocking_section() and caml_leave_blocking_section()), and in particular not in argument position of a C call, as in:
close_unix.c- caml_enter_blocking_section();
close_unix.c: ret = close(Int_val(fd));
> As discussed in #12737, using Int_val inside blocking sections can cause data races and is now seen as a bad idea.
Int_val is just a shift, so would the code generated by the C compiler not be the same anyway? In trying to make bindings safe to use with OCaml 5, what do we need to look out for?
My understanding is that this is mostly an attempt to respect what the manual recommends in the chapter “Interfacing C with OCaml”, which says:
> Consequently, arguments provided by OCaml to the C primitive must be copied into C data structures before calling caml_release_runtime_system(), and results to be returned to OCaml must be encoded as OCaml values after caml_acquire_runtime_system() returns.
However, technically speaking, there is no actual safety issue: registering an immediate as a root will not cause the GC to do anything with it, and neither will extracting the underlying machine integer using Int_val.
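To make the "just a shift" point concrete, here is a minimal, self-contained model of how OCaml encodes immediate integers. It is only a sketch: the `model_` names are hypothetical, the real macros live in `<caml/mlvalues.h>`, and the real `Int_val` additionally casts the result to `int`. Like the runtime, it assumes that `>>` on a signed integer is an arithmetic shift.

```c
#include <assert.h>
#include <stdint.h>

/* Model only: names prefixed model_ are hypothetical stand-ins for the
   macros defined in <caml/mlvalues.h>. */
typedef intptr_t model_value;

/* Encode: shift left by one and set the low tag bit. */
static model_value model_val_int(intptr_t n)
{
  return (model_value)(((uintptr_t)n << 1) + 1);
}

/* Decode: a right shift of the word itself; no memory is dereferenced.
   Assumes >> on a signed integer is arithmetic, as the runtime does. */
static intptr_t model_int_val(model_value v)
{
  return v >> 1;
}

/* Low bit set means "immediate". Heap pointers are word-aligned, so their
   low bit is clear, which is how a scan can tell the two apart. */
static int model_is_immediate(model_value v)
{
  return (v & 1) != 0;
}

/* Small usage check: decoding undoes encoding, and the tag bit is set. */
static void model_roundtrip_demo(void)
{
  assert(model_int_val(model_val_int(42)) == 42);
  assert(model_is_immediate(model_val_int(42)));
}
```

Decoding only looks at the bits of the word it is given, which is why extracting the integer does not involve the heap at all.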
The nuance that makes the code in channels_unix.c, connect_unix.c and others safe, is that it’s OK to use an immediate in a blocking section, so long as it’s not registered as a GC root. This nuance is currently not documented in the manual.
We are not entirely certain of that, because the GC does write on registered immediates, as mentioned in #13188. It happens to write the exact same value, but it’s not entirely clear (at least to me) that it’s always safe to read on all platforms. And it’s clearly wrong to write (the immediate may be reverted to its old value by the GC, or worse).
> The nuance that makes the code in channels_unix.c, connect_unix.c and others safe, is that it’s OK to use an immediate in a blocking section, so long as it’s not registered as a GC root.
We have:
CAMLprim value caml_unix_chown(value path, value uid, value gid)
{
  CAMLparam1(path);
  char * p;
  int ret;
  caml_unix_check_path(path, "chown");
  p = caml_stat_strdup(String_val(path));
  caml_enter_blocking_section();
  ret = chown(p, Int_val(uid), Int_val(gid));
  caml_leave_blocking_section();
  caml_stat_free(p);
  if (ret == -1) caml_uerror("chown", path);
  CAMLreturn(Val_unit);
}
Aren’t value parameters GC roots? Here only the first one, path, is registered with CAMLparam1, because it is not a simple value. Is using Int_val on uid and gid safe only because they are simple values and are therefore not registered?
So the detail here is that simple values are parameters but not registered, and hence they are safe to use?
I believe the pattern of registering only complex (non-immediate) value arguments with CAMLparam* is used in the compiler’s source code but is not covered in the manual. The manual gives the impression that all value arguments should be declared with CAMLparam*, which in turn rules out using Int_val et al. in argument position of the foreign function call.
So if in doubt, it is recommended that you follow the manual’s recommendations. The compiler codebase does not always do so, but then again, it falls under the category “maintained by experts”.
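For the chown stub quoted earlier in the thread, following the manual would roughly mean decoding uid and gid into C locals before entering the blocking section. The sketch below shows that shape only. So that it compiles without the OCaml headers, `value`, `Int_val`, `Val_int` and the two blocking-section functions are simplified stand-ins (in a real stub they come from `<caml/mlvalues.h>` and `<caml/signals.h>`), and `runtime_held`, `fake_chown` and `chown_copy_first` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins so the sketch compiles on its own. In a real stub, value,
   Int_val and Val_int come from <caml/mlvalues.h>, and the blocking-section
   functions from <caml/signals.h>. */
typedef intptr_t value;
#define Int_val(v) ((int)((v) >> 1))
#define Val_int(n) ((value)(((uintptr_t)(n) << 1) + 1))

static int runtime_held = 1;  /* hypothetical flag, for this sketch only */
static void caml_enter_blocking_section(void) { runtime_held = 0; }
static void caml_leave_blocking_section(void) { runtime_held = 1; }

/* Hypothetical stand-in for chown(2): it just records its arguments. */
static int seen_uid, seen_gid;
static int fake_chown(const char *path, int uid, int gid)
{
  (void)path;
  assert(!runtime_held);  /* we expect to be inside the blocking section */
  seen_uid = uid;
  seen_gid = gid;
  return 0;
}

/* Manual-compliant shape: decode every OCaml argument into a C local while
   the runtime is still held, then use only C data until it is re-acquired. */
static int chown_copy_first(const char *p, value uid, value gid)
{
  int c_uid = Int_val(uid);
  int c_gid = Int_val(gid);
  int ret;

  caml_enter_blocking_section();
  ret = fake_chown(p, c_uid, c_gid);  /* no `value` is read here */
  caml_leave_blocking_section();
  return ret;
}
```

Applied to the real stub, the only change would be introducing the two locals before caml_enter_blocking_section(); the path is already copied with caml_stat_strdup.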