Foreign function overhead

I’m trying to call a function in a shared library using the ctypes Dl module, and I found that the function returned by Ctypes.coerce adds 150-160ns of overhead per call. For reference, making the same call through a C stub takes 8ns in the same benchmark.

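let dl_bench =
  (* Load the shared library at runtime. *)
  let handle =
    Dl.dlopen
      ~filename:"dlopen_example/build/src/libmy_lib.so"
      ~flags:[ RTLD_LAZY ]
  in
  (* Look up the raw address of the C function. *)
  let sym = Dl.dlsym ~handle ~symbol:"plus_one" in
  (* Describe the C signature: int plus_one(int). *)
  let typ = Foreign.funptr Ctypes.(int @-> returning int) in
  (* Turn the raw address into a callable OCaml function. *)
  let dlpo = Ctypes.(coerce (ptr void) typ (ptr_of_raw_address sym)) in
  assert (dlpo 4 = 5);
  Bench.Test.create ~name:"dlsym" (fun _ -> dlpo 4 |> ignore)
;;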

I’m having a hard time understanding what Ctypes.coerce actually does and how I could improve the performance.

Is there any way I could call a function loaded using Dl.dlsym with lower overhead?

Note: I assume the overhead is not from Dl.dlsym but from coerce itself.

Edit: Nearly the same overhead applies when using the ctypes Foreign interface.
Edit: On the other hand, it might be Foreign.funptr that causes the overhead.
Edit: I think I understand now: the generic FFI C stub might be the cause of the overhead. I’ll probably opt for a less generic one, since I don’t need the generality. I’m going to leave this thread up in case someone stumbles on the same issue.

This overhead arises because the Foreign module in ctypes goes through libffi’s dynamic invocation layer. If you want performance closer to a hand-written C stub, use the Ctypes C stub generation mode (Cstubs). This requires a bit more build system integration, but you can find an example at GitHub - avsm/ocaml-yaml: OCaml interface to the YAML 1.1 spec.
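
For a rough idea of what stub generation involves, here is a minimal sketch for the plus_one function from the question. It is an illustration, not the layout used by any particular project: the prefix "my_lib_stub", the header "my_lib.h", and the generate function are hypothetical names, and it assumes the ctypes and ctypes.stubs libraries are available.

```ocaml
(* Sketch only: requires the ctypes and ctypes.stubs libraries. *)
open Ctypes

(* The bindings are written once as a functor, so the same description can
   drive both the stub generator and the final program. *)
module Bindings (F : Ctypes.FOREIGN) = struct
  open F (* F's (@->) and returning shadow the ones from Ctypes *)

  (* plus_one is the C function from the question: int plus_one(int). *)
  let plus_one = foreign "plus_one" (int @-> returning int)
end

(* Generator step, normally a separate executable run by the build system.
   The prefix and the header name are hypothetical. *)
let generate ~c_out ~ml_out =
  let prefix = "my_lib_stub" in
  (* The generated C code needs a declaration of plus_one in scope. *)
  Format.fprintf c_out "#include \"my_lib.h\"@\n";
  (* Emit the C stubs that call plus_one directly, without libffi. *)
  Cstubs.write_c c_out ~prefix (module Bindings);
  (* Emit the OCaml module that declares the matching externals. *)
  Cstubs.write_ml ml_out ~prefix (module Bindings)
```

The build then compiles the generated C file, and the program applies Bindings to the generated OCaml module to get an ordinary plus_one : int -> int. One caveat: the generated stubs refer to plus_one by name, so as far as I can tell this fits when you can link against the library, rather than resolving the symbol with Dl.dlsym at runtime.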

Another example is in directories; that’s the exact commit where we switched from libffi to stub generation.
