The documentation gives these rules for binding a C library.
Is it safe for me to relax the rules with the following assumptions?
1. If a value is unboxed and does not contain pointers to the OCaml heap (e.g. it is an int or a bool), it does not need to be registered with CAMLparam or CAMLlocal. Rationale: The OCaml GC cannot invalidate them and doesn’t need to know them to trace live values.
2. If the function does not allocate on the OCaml heap, it does not need to use CAMLparam or CAMLlocal at all. Rationale: The GC can only be triggered upon an allocation. Therefore, if the function does not allocate on the OCaml heap, it does not need to register its local variables as roots because the OCaml GC will not run over the duration of the function call. Because the multicore runtime relies on poll points inserted in the compiled OCaml code to context switch, it also cannot preempt a C function.
If these assumptions are correct, is it a good idea for me to write bindings in this way?
Your 1. is correct. Your 2. must be strengthened into “the function does not call the GC or release the runtime lock”. There are other ways the GC can run than by calling an allocation function (this is currently partly in undocumented territory).
Oh yeah, I have a third rule to ask about. If a function allocates on the OCaml heap, but it does not have any local variables that contain pointers to the OCaml heap, it is also safe to skip calling CAMLparam at all, right?
Only if you haven’t registered any callbacks. If someone has registered an OCaml function to be used as a callback by some C code, and you call into a C library which under some circumstances ends up calling this callback then it will run OCaml code while having your C function on the stack.
This applies transitively (there might be OCaml callbacks registered in other C libraries that your C library indirectly uses), although that’d be a rare case. If your library never registers any callbacks you probably shouldn’t worry about this case.
Hey, what a coincidence that you decided to show up! What you’re pointing out is something that I recently realized I need to worry about…
I’m currently patching the LLVM bindings to be compatible with OCaml 5. The LLVM bindings, as they currently stand, use naked pointers and omit CAMLparamX and CAMLreturn everywhere, making them unsuitable for the multicore runtime. I have patched them to encode all pointers outside the OCaml heap as OCaml integers, and inserted the necessary CAMLparamX and CAMLreturn macros.
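To illustrate the integer encoding mentioned above, here is a minimal sketch of tagging an out-of-heap pointer so that it looks like an OCaml immediate integer. It relies on the fact that OCaml treats any word with its low bit set as an integer and never follows it during a GC. The names `ptr_to_val`, `val_to_ptr` and `is_immediate` are hypothetical, and `value` is a stand-in typedef so the snippet compiles without the OCaml headers; real stubs would take `value` from `<caml/mlvalues.h>`. This is not necessarily how the LLVM patch does it.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for OCaml's `value` type (an assumption for this sketch);
   real stubs would include <caml/mlvalues.h> instead. */
typedef intptr_t value;

/* Encode a pointer as an immediate by setting the low bit.
   Requires the pointer to be at least 2-byte aligned. */
static value ptr_to_val(void *p)
{
  uintptr_t bits = (uintptr_t)p;
  assert((bits & 1) == 0);
  return (value)(bits | 1);
}

/* Decode by clearing the low bit again. */
static void *val_to_ptr(value v)
{
  return (void *)((uintptr_t)v & ~(uintptr_t)1);
}

/* Same test the runtime applies to tell immediates from heap pointers. */
static int is_immediate(value v)
{
  return (v & 1) != 0;
}
```

Because the encoded word is an immediate, rule 1 applies to it: it does not have to be registered as a root, although the C object it points to is entirely outside the GC's control and must be freed by other means.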
I had been wondering if I could get away with skipping these macros in some places. However, one of the features of the LLVM bindings is a “diagnostic handler,” and the bindings allow an arbitrary OCaml function to be registered as a handler. I’m not sure which LLVM functions may call the handler, so I have to assume that all of them may trigger the OCaml GC.
Even more concerning is what would happen if the diagnostic handler raised an OCaml exception. It would likely violate many assumptions in the LLVM code, skipping destructors and more.
I’m concerned because two reviewers I’ve tagged have resigned, and I think that there are few people who are deeply familiar with both LLVM and the OCaml runtime. I’m bringing this up because I noticed that you were a contributor to the LLVM bindings, many years ago. Perhaps you have some expertise to add?
See the OCaml manual chapter “Interfacing C with OCaml”, in particular the part quoted below. The C bindings should probably use caml_callback_exn, check the result, and then raise a C++ exception to avoid the problem with destructors:
If the C code wishes to catch exceptions escaping the OCaml function, it can use the functions caml_callback_exn, caml_callback2_exn, caml_callback3_exn, caml_callbackN_exn. These functions take the same arguments as their non-_exn counterparts, but catch escaping exceptions and return them to the C code. The return value v of the caml_callback*_exn functions must be tested with the macro Is_exception_result(v). If the macro returns “false”, no exception occurred, and v is the value returned by the OCaml function. If Is_exception_result(v) returns “true”, an exception escaped, and its value (the exception descriptor) can be recovered using Extract_exception(v).
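As a rough sketch of how the quoted functions fit together in plain C stubs: since C cannot raise a C++ exception, this variant records the escaped OCaml exception and re-raises it only after the library call has returned normally, so the library's own stack unwinds as usual. All `example_*` names, the two static variables and the stand-in library call are hypothetical; only the `caml_*` functions and macros come from the OCaml runtime headers. It is meant to be compiled as part of an OCaml stub library, not standalone.

```c
#include <caml/mlvalues.h>
#include <caml/memory.h>
#include <caml/alloc.h>
#include <caml/callback.h>
#include <caml/fail.h>

/* Hypothetical state: the registered OCaml handler and a deferred exception.
   Both are global roots so the GC keeps them alive and up to date. */
static value handler_closure = Val_unit;
static value pending_exn = Val_unit;
static int roots_registered = 0;
static int has_pending_exn = 0;

/* Called from OCaml to install the handler. */
value example_set_handler(value f)
{
  CAMLparam1(f);
  if (!roots_registered) {
    caml_register_generational_global_root(&handler_closure);
    caml_register_generational_global_root(&pending_exn);
    roots_registered = 1;
  }
  caml_modify_generational_global_root(&handler_closure, f);
  CAMLreturn(Val_unit);
}

/* Trampoline the C library would invoke: runs the OCaml handler and
   catches any exception instead of letting it unwind through C frames. */
static void example_trampoline(const char *msg)
{
  CAMLparam0();
  CAMLlocal1(arg);
  value res; /* not a root: an encoded exception must not be seen by the GC */
  arg = caml_copy_string(msg);
  res = caml_callback_exn(handler_closure, arg);
  if (Is_exception_result(res)) {
    caml_modify_generational_global_root(&pending_exn, Extract_exception(res));
    has_pending_exn = 1;
  }
  CAMLreturn0;
}

/* Stand-in for a library function that may emit a diagnostic. */
static void example_library_call(void)
{
  example_trampoline("diagnostic");
}

/* Stub called from OCaml: re-raise only once the library has returned. */
value example_do_work(value unit)
{
  CAMLparam1(unit);
  example_library_call();
  if (has_pending_exn) {
    value exn = pending_exn;
    has_pending_exn = 0;
    caml_modify_generational_global_root(&pending_exn, Val_unit);
    caml_raise(exn);
  }
  CAMLreturn(Val_unit);
}
```

Whether deferring the exception like this is acceptable depends on whether the library can carry on sensibly after the handler has failed; the C++ exception route suggested above has the opposite trade-off.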
I would recommend trying to write a small unit test with the problematic code sequence: set up an OCaml diagnostic handler, call some other OCaml functions that call into the C bindings in a way that invokes this diagnostic handler, and raise an exception from it. Check that it works correctly with the fixed C binding, and then try to see whether you can demonstrate that the old code is incorrect, e.g. by using the OCaml debug runtime, or by inserting calls to Gc.compact in some places.
Detecting skipped C++ destructors would be trickier. They might result in a memory leak, so perhaps valgrind could catch something? Or build with the sanitizers: I think AddressSanitizer, through its LeakSanitizer component, can catch memory leaks too.