Writing C bindings in 2024

Hello,

This is just a quick post to ask what the current “standard” way of writing C bindings is. By hand? Using Ctypes?
I thought it was Ctypes, but the documentation is rather incomplete (or I could not find more complete documentation), and writing bindings by hand seems error-prone.

I have several libraries I would like to write bindings for, including tree-sitter, which already kind of has OCaml bindings, but I’m struggling with building/using it as it is not published on opam, and the library doesn’t seem to expose important features…

Anyway, any pointers are welcome :slight_smile:

Ctypes is covered in the “Foreign Function Interface” chapter of Real World OCaml, and it should be practical for describing C structures.

Note: when using Ctypes, the linker doesn’t see the dependencies (Ctypes uses dynamic binding).

If you want to force the linker to link a library, add a dummy dependency that forces the library to be linked statically. See this extract from the pg_query source:

(* Hack needed to make symbols available, see constfun's comment here
 * https://github.com/ocamllabs/ocaml-ctypes/issues/541 *)
external _force_link_ : unit -> unit = "pg_query_free_parse_result"

(It is only a dummy declaration, so there is no need to use the exact types.)

The original way is described in the OCaml manual, in “Interfacing C with OCaml” (but the OCaml value representation is not transparent there).

I second the question. I assumed Ctypes is the “blessed” way, but then I heard somewhere it’s not actively developed? Also, the most convenient way to use Ctypes is via Foreign, but that’s suboptimal for most usage (where static linking is fine). I’m guilty of using Foreign for bindings that should be static.

Both!

It depends a bit on the C API you need to bind to: whether it is large or small, and which C idioms it uses. Some C APIs are straightforward to bind to (say, an abstract pointer and functions acting upon it), and the OCaml FFI poses no problem; others are less so (use of records, callbacks, etc.).

In any case, I advise you to always make thin bindings: that is, map as straightforwardly as possible to the C functionality, and then prettify things and make usage safe in OCaml itself, behind your .mli (versus constructing elaborate OCaml values on the C side or using fancier features like callbacks or custom operations, except if really needed for finalizers).

Here are a few examples of mine:

  • SDL binding, large API using bare C records, so with ctypes.

  • zstd, zlib, xxhash, md, blake3 bindings. This is all trivial byte crunching, so OCaml FFI.

  • Monotonic and POSIX clock bindings, totally trivial so OCaml FFI.

  • sqlite3 bindings. This is quite an interesting case if you compare it to the size and complexity of the very old ocaml-sqlite bindings, which make a lot of errors that I would also have made had I written the binding at the time (notably, IIRC, using finalized values for prepared statements is not a good idea: a prepared statement is a resource, and you can’t close your database handle while statements are still lingering). This is a good example of thin bindings + making things safe and ergonomic on the OCaml side.

  • tweetnacl bindings. This is also all trivial byte crunching, so OCaml FFI.

  • OpenGL bindings. These use a hand-made custom generator to generate them from the XML files describing the functions.
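To make the “thin bindings, then safety on the OCaml side” advice concrete, here is a minimal sketch. It is not taken from any of the libraries above: the C side is faked with plain OCaml so that the snippet runs on its own, and every name (Raw, Db, with_db, with_stmt) is hypothetical. The idea is simply that statements and database handles get explicit, scoped lifetimes instead of relying on finalized values.

```ocaml
(* Thin layer: one OCaml function per C function. The C side is faked here
   so the sketch is self-contained; in a real binding these would be
   [external] declarations or ctypes bindings. All names are hypothetical. *)
module Raw = struct
  type db = { mutable is_open : bool; mutable stmts : int }

  let db_open (_path : string) = { is_open = true; stmts = 0 }
  let stmt_prepare db = db.stmts <- db.stmts + 1
  let stmt_finalize db = db.stmts <- db.stmts - 1

  (* Mimics a C library that refuses to close a handle while
     statements created from it are still alive. *)
  let db_close db =
    if db.stmts > 0 then Error "statements still alive"
    else begin
      db.is_open <- false;
      Ok ()
    end
end

(* Safe layer: what the .mli would expose. Lifetimes are scoped and
   explicit, so a statement cannot outlive its database handle. *)
module Db : sig
  type t
  val with_db : string -> (t -> 'a) -> 'a
  val with_stmt : t -> (unit -> 'a) -> 'a
  val live_stmts : t -> int
end = struct
  type t = Raw.db

  let live_stmts db = db.Raw.stmts

  let with_stmt db f =
    Raw.stmt_prepare db;
    (* The statement is released even if [f] raises. *)
    Fun.protect ~finally:(fun () -> Raw.stmt_finalize db) f

  let with_db path f =
    let db = Raw.db_open path in
    let close () =
      match Raw.db_close db with
      | Ok () -> ()
      | Error msg -> failwith msg
    in
    Fun.protect ~finally:close (fun () -> f db)
end
```

Because with_stmt always finalizes its statement before with_db reaches its own cleanup, the close at the end of with_db cannot fail on lingering statements, which is the problem the sqlite3 remark above is about.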

Note that it is also possible to generate the stubs, but I have never used that. dune should have built-in support for it.

I heard that there was a release six weeks ago.

@dbuenzli I don’t think the dune option will make the right library statically linked. We have to use the hack I got from pg_query. This is because the linker will not include a libXXX.a if none of its symbols seem to be used.

I guess the main added value of the dune option is to generate the type declarations from the .h file(s).

Getting the right flags for linking can be tricky, whether dynamically or statically (see for example this comment by @jbeckford).

My point was rather that if you want to be able to statically link the C libraries you bind to with ctypes, you have to generate the C stubs. If you are using the ctypes.foreign library (like tsdl does), it goes through libffi, which uses dynamic linking against the library you bind to.
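For readers who have not used it, this is roughly what the Foreign (libffi) style looks like. It is only a sketch, assuming the ctypes and ctypes-foreign opam packages are installed; strlen is picked purely as an illustration because it is already present in the process through libc, so no extra linker flag is involved.

```ocaml
(* Assumes the ctypes and ctypes-foreign opam packages are installed. *)
open Ctypes
open Foreign

(* C prototype: size_t strlen(const char *s);
   The symbol is looked up dynamically at run time, which is why the
   linker never sees a dependency on the library providing it. *)
let strlen = foreign "strlen" (string @-> returning size_t)

let () =
  let n = Unsigned.Size_t.to_int (strlen "hello") in
  Printf.printf "strlen returned %d\n" n
```

Nothing here is checked against a header file: if the type description does not match the C prototype, the mismatch only shows up at run time. The generated-stubs mode trades this convenience for C code that the compiler checks and the linker can link statically.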

OK, I have seen how this should work. I hadn’t realized that Ctypes could work in two modes (static/dynamic).

However, the little example from the dune manual doesn’t work:

Error: Library "ctypes.stubs" not found.
-> required by _build/default/foo.exe
-> required by alias all
-> required by alias default
Error: Library "ctypes" not found.
-> required by _build/default/libfoo__c_cout_generated_types.exe
-> required by alias all
-> required by alias default

(Note: the library-specific linker flags should be provided by pkg-config, from a libfoo.pc file.)

I have both ctypes and ctypes-build installed.

EDIT: I guess dune needs an old version of ctypes that is no longer available.

loyer@ak-serveur3:~/essai_ctypes$ dune build
File "dune-project", line 2, characters 14-20:
2 | (using ctypes 0.23)
                  ^^^^
Error: Version 0.23 of the ctypes extension is not supported.
Supported versions of this extension in version 3.16 of the dune language:
- 0.1 to 0.3
loyer@ak-serveur3:~/essai_ctypes$ vi dune-project 
loyer@ak-serveur3:~/essai_ctypes$ opam install ctypes.0.3
[ERROR] Package ctypes has no version 0.3.

I understand your confusion. (using X VVV) does not mean “use version VVV of package X”; for that, you use (depends (X (= VVV))).

using just tells dune to use version 0.3 of its ctypes configuration (there were older ways of configuring ctypes within dune that we now consider obsolete). Thus you should write (using ctypes 0.3), and then it should work with the current version of ctypes.
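Put together, a dune-project could look like the fragment below. This is only a sketch: the package name foo and the version bound on the ctypes package are placeholders chosen for illustration, and the lang version is the one that appears in the error message above.

```dune
(lang dune 3.16)

; Selects version 0.3 of dune's ctypes *extension*.
; This is unrelated to the version of the ctypes opam package.
(using ctypes 0.3)

(generate_opam_files true)

(package
 (name foo) ; hypothetical package name
 (depends
  ; The version of the ctypes *package* is constrained here instead.
  ; The bound is an example only.
  (ctypes
   (>= 0.23.0))))
```

The depends field only feeds the generated opam file; the package itself is still installed with opam install ctypes.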

I have a concrete library that I wish to write bindings for, and I think this example could also be instructive for other users.

After building automerge-c, I have a build folder which contains (among other things):

├── include
│   └── automerge-c
│       ├── automerge.h
│       ├── config.h
│       └── utils
│           └── enum_string.h
├── libautomerge.a
└── objects
    └── automerge_core
        ├── automerge_core-add29bbe2a10d4dd.automerge_core.812158d38d571e73-cgu.0.rcgu.o
        └── compiler_builtins-d7e67037a88093ce.compiler_builtins.8d8aa21741ec8337-cgu.0.rcgu.o

Here is a minimal example of how to use the C code:

Now I have no idea how to proceed. Hope I am not asking for too much!

Thank you

I think the bindings are pretty trivial, and actually I also realised that they were already written and imported in the semgrep library. They bind a lot more than what semgrep exposes, so I should be able to just copy that.

I had seen both the Real World OCaml documentation and the dune documentation (I find the dune one slightly easier to follow). But what worries me most about Ctypes is what to do if I need e.g. finalisers, or need to tweak how it interacts with the garbage collector. I looked for this information, and while the question is raised, the answer is lacking. While I guess this is all minor, I didn’t quite know what the community was using at the moment given the state of things.
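On the finaliser question, one option that does not depend on how the binding is written is the standard library’s Gc.finalise, attached to the OCaml value that wraps the C pointer. The sketch below shows only that pattern: the C allocation is faked with plain OCaml so that it runs on its own, and raw_alloc, raw_free and handle are hypothetical names. The finaliser is treated as a safety net next to an explicit free, since there is no guarantee about when (or whether, before exit) it runs.

```ocaml
(* Stand-ins for a C allocate/free pair; in a real binding these would be
   ctypes bindings or [external] declarations. Names are hypothetical. *)
let live = ref 0

(* The "pointer" is a bool ref: true means still valid. *)
let raw_alloc () =
  incr live;
  ref true

(* Idempotent, so an explicit free followed by the finaliser is harmless. *)
let raw_free p =
  if !p then begin
    p := false;
    decr live
  end

(* The OCaml-side wrapper. It is a heap-allocated record, which is what
   [Gc.finalise] requires. *)
type handle = { ptr : bool ref }

let create () =
  let h = { ptr = raw_alloc () } in
  (* Safety net only: runs at some unspecified point after [h] becomes
     unreachable. The closure must not capture [h] itself. *)
  Gc.finalise (fun h -> raw_free h.ptr) h;
  h

(* Preferred path: release the resource explicitly. *)
let free h = raw_free h.ptr
```

Making the release function idempotent is what lets the explicit path and the finaliser coexist without a double free.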

Cheers!