Designing a C API for an OCaml library

Hi everyone,
I am trying to make a Julia binding of an OCaml library I wrote. To do so, my plan is to first write a C API and then use Julia’s FFI to communicate with the C API.


I found clear instructions in the manual to write a C wrapper for OCaml functions whose input and output types can be naturally represented in C (such as int, string or even arrays). For example, if I want to expose fib: int -> int, I can write the following stub:

int fib(int n) {
  static const value * fib_closure = NULL;
  if (fib_closure == NULL) fib_closure = caml_named_value("fib");
  return Int_val(caml_callback(*fib_closure, Val_int(n)));
}

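Also, when a stub holds intermediate OCaml values across calls that may allocate, it is important to declare them with the CAMLlocal macros (inside a CAMLparam/CAMLreturn pair) so that the GC neither collects them too early nor moves them without updating the variable.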


My question is about dealing with more complex objects such as recursive ADTs. For example, suppose I have a module with the following signature:

type expr (* deliberately left abstract *)
val random_expr_of_size: int -> expr
val evaluate: expr -> int

Ideally, I would want to translate this into the following C API:

typedef value* expr;
expr random_expr_of_size(int size);
int evaluate(expr e);
void release(value* obj);

The idea is that the C equivalent of random_expr_of_size would return a pointer to an opaque OCaml value, which is not to be manipulated directly but only passed to other functions (such as evaluate). For this to work, though, there must be a way to tell the GC explicitly when it is safe to collect this value, which is what the release function is for. Note that the calls to release can ultimately be automated on the Julia side by attaching finalizers to the Julia GC.

However, I am not sure how to implement such a C API. I thought about using caml_register_global_root and caml_remove_global_root but:

  1. I have no idea how efficient it would be.
  2. I imagine one should be careful that the value returned by random_expr_of_size is really a root and no other value points to it. What would be the best way to ensure this? Should I allocate a new shallow copy of the value returned by OCaml (I guess there is no need to copy recursively here) and if yes how can I do that?

I would greatly appreciate your help on this!

Generally speaking I think you’re on the right track. Keeping the OCaml structured value opaque on the C side is certainly simpler than trying to manipulate it directly from C.

However, I am not sure how to implement such a C API. I thought about using caml_register_global_root and caml_remove_global_root

Or even better caml_{register,remove}_generational_global_root.

  1. I have no idea how efficient it would be.

Very efficient with caml_{register,remove}_generational_global_root, slightly less with caml_{register,remove}_global_root.

  2. I imagine one should be careful that the value returned by random_expr_of_size is really a root and no other value points to it. What would be the best way to ensure this? Should I allocate a new shallow copy of the value returned by OCaml (I guess there is no need to copy recursively here) and if yes how can I do that?

A GC root can be shared with other values. The GC will figure it out. No need to copy anything. Copying a data structure to make it uniquely owned is for lesser languages :)

Thanks for your help!

To be sure, is the following a correct implementation of the API I was referring to?

#include <stdlib.h>
#include <caml/mlvalues.h>
#include <caml/memory.h>
#include <caml/callback.h>

typedef value* expr;

expr random_expr_of_size(int size) {
  static const value * closure = NULL;
  if (closure == NULL) closure = caml_named_value("random_expr_of_size");
  value res = caml_callback(*closure, Val_int(size));
  expr allocated = malloc(sizeof(value));
  *allocated = res;
  caml_register_generational_global_root(allocated);
  return allocated;
}

int evaluate(expr e) {
  static const value * closure = NULL;
  if (closure == NULL) closure = caml_named_value("evaluate");
  return Int_val(caml_callback(*closure, *e));
}

void release(value* obj) {
  caml_remove_generational_global_root(obj);
  free(obj);
}

Also, the manual says the following about using caml_register_generational_global_root:

In this case, you must not modify the value of v directly, but you must use caml_modify_generational_global_root(&v,x) to set it to x.

Do I have anything to worry about if I treat v as opaque in the C code but have OCaml functions modify it?

Finally, does anyone have a recommendation for automating the generation of the C API? I am thinking of writing a tool that would parse the library's .mli files and generate the C stubs automatically.
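
As a rough sketch of what such a generator would emit, the function below writes the stub for a single int -> int function, following the fib pattern from the first post. It is only an illustration: the name emit_int_stub is hypothetical, parsing the .mli is left out entirely, and abstract types such as expr would need a handle-based template instead.

```c
#include <stdio.h>

/* Hypothetical helper: writes into buf a C stub for an OCaml function
   `name : int -> int`, assuming the OCaml side registered it under the
   same name. Returns what snprintf returns: the length of the full stub,
   which is >= size if the output was truncated. */
int emit_int_stub(char *buf, size_t size, const char *name) {
  return snprintf(buf, size,
    "int %s(int n) {\n"
    "  static const value * closure = NULL;\n"
    "  if (closure == NULL) closure = caml_named_value(\"%s\");\n"
    "  return Int_val(caml_callback(*closure, Val_int(n)));\n"
    "}\n",
    name, name);
}
```

Calling emit_int_stub(buf, sizeof buf, "fib") would reproduce the fib stub shown at the top of the thread, with the closure variable renamed to closure.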