Hi there, another blog post.
This time I discuss ideas for a new interface that helps localize the possibilities of errors when working with a Ctypes-style FFI. Comment below if you like/hate it please!
Matt
I'm not sure why we need to manage memory allocation on the OCaml side for Ctypes.
I can see a potential performance advantage, but a hand-written C binding would copy the string internally in the stub if needed, and you'd have a safe way to deal with lifetimes: the CAMLparam macro that registers it as a GC root, and you pass the OCaml value itself as an argument (not an unsafe pointer to something held by another OCaml value).
It'd be good if the manual memory management was opt-in in Ctypes, and there was a default interface that is safer and doesn't require handling of raw pointers.
At least when the C stubs mode is used to generate C bindings. I can see why it'd be needed in the libffi mode.
Small comment on your approach: you could try to use a tree instead of a list to avoid having to concatenate lists in bind. All we need here is to keep the values alive, so a tree of tuples might work too if you add another case to your GADT?
At least when the C stubs mode is used to generate C bindings. I can see why it'd be needed in the libffi mode.
This is indeed talking about the libffi mode. I can't use the C stubs mode for the bindings I'm making because I have to compile the OCaml code as an .so and then call a function that is passed to me in order to access Godot API function pointers.
Regardless, I am a little perplexed: if a C binding is passed an OCaml string or some other array/pointer type that has been allocated using allocate, I have observed the crashes mentioned in the blog post. Are you saying that in C stubs mode these crashes would not occur? My understanding was that they have to do with the OCaml garbage collector not knowing it can't throw away the off-heap memory they have a pointer to, since C might have a pointer to it too.
I can see a potential performance advantage, but a hand written C binding would copy the string internally in the stub if needed, and you'd have a safe way to deal with lifetimes: the CAMLparam macro that registers it as a GC root, and you pass the OCaml value itself as an argument (not an unsafe pointer to something held by another OCaml value).
Indeed, though these are not handwritten C bindings and so CAMLparam is not available to me.
Regarding my approach (forgot to mention): I like the idea of using a tree. It would in fact be much easier than you'd think: since Dep : 'a -> dep, I can simply apply Dep to a tuple!
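A minimal sketch of that idea, assuming a simplified stand-in for the Living module (the names `living`, `return`, `keep_alive`, and `bind` below are hypothetical and not taken from the library). Because `Dep` accepts any value, merging two dependency sets is just wrapping a pair, which takes constant time and needs no list concatenation:

```ocaml
(* Hypothetical model, not the actual Living API. *)

(* An existential wrapper: [Dep x] keeps [x] reachable without exposing its type. *)
type dep = Dep : 'a -> dep

(* A value together with everything that must stay alive alongside it. *)
type 'a living = { value : 'a; deps : dep }

(* A value with no dependencies. *)
let return v = { value = v; deps = Dep () }

(* Record that [v] depends on [owner] staying alive. *)
let keep_alive owner v = { value = v; deps = Dep owner }

(* Merging is O(1): wrap both dependency sets in a tuple and apply [Dep]. *)
let bind m f =
  let r = f m.value in
  { value = r.value; deps = Dep (m.deps, r.deps) }

(* Two chained steps, each with its own buffer to keep alive. *)
let example =
  bind (keep_alive (Bytes.create 16) 1) (fun n ->
    keep_alive (Bytes.create 32) (n + 1))
```

The nested tuples form the tree implicitly, so no extra constructor is needed. Nothing ever inspects the tree; it only has to stay reachable for as long as the `living` value is.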
I mean that in C stubs mode Ctypes could generate code that is safer, i.e. closer to what a hand-written function would look like.
But because C stubs mode and libffi mode have the same interface, I don't think it is currently possible (unless you abandon libffi mode, but as you say that mode is useful too).
In particular it'd be good if some solution could be found in Ctypes itself, to ensure that the ptr type holds references to the OCaml value that "holds" the memory allocated, and that this is registered as a root on the C side. Looking at the current implementation it does seem to contain some code to track this, but perhaps this is not complete when used in FFI mode.
Although if you use libffi mode then there isn't any "C side" to register the roots on, perhaps the @-> operator could do the kind of dependency tracking that you implemented in the living module? (e.g. build up a nested tuple of all the arguments, so we hold it all alive while the call is running?).
Although if you use libffi mode then there isn't any "C side" to register the roots on, perhaps the @-> operator could do the kind of dependency tracking that you implemented in the living module? (e.g. build up a nested tuple of all the arguments, so we hold it all alive while the call is running?).
The problem is unfortunately worse than this. It can occur after the call returns, since C can return a pointer into the structure you pass in, or store the data you pass in, and so I don't think it can be fixed directly in Ctypes without (likely unwanted) overhead.
Perhaps there should be a way to declare this at the time the function's signature is defined.
Not all functions work this way. It is only needed when the function returns some kind of pointer, and in that case you could declare, when you define the function, that the result depends on one of the arguments:
let arg_returned = returned_ref <.... some type > in
... @-> arg_returned @-> returning (references arg_returned <type>)
(So you don't need to keep everything alive, just the argument that holds the memory).
ptr already appears to be capable of tracking dependencies: the CPointer field has an "Obj.t option" field, so the overhead of keeping track of this may not be that large.
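As a rough illustration of why the overhead could be small, here is a simplified model of a pointer that carries an optional reference to the OCaml value owning its memory. The record and function names are made up for this sketch and do not mirror the Ctypes internals exactly. The point is only that a pointer derived from an argument can inherit that argument's owner, so holding on to the result keeps the argument's memory alive:

```ocaml
(* Simplified, hypothetical model of a pointer with an owner reference. *)
type fat_ptr = {
  addr : nativeint;        (* raw address, opaque to the GC *)
  managed : Obj.t option;  (* OCaml value that owns the memory, if any *)
}

(* A pointer into memory owned by [owner]. *)
let of_owner owner addr = { addr; managed = Some (Obj.repr owner) }

(* Model a C function that returns a pointer into its argument:
   the result inherits the argument's owner. *)
let derive parent offset =
  { addr = Nativeint.add parent.addr (Nativeint.of_int offset);
    managed = parent.managed }

let buffer = Bytes.create 64
let arg = of_owner buffer 0x1000n  (* the address is a placeholder *)
let ret = derive arg 8
```

The cost per derived pointer is one extra word copied from the parent, which is what makes declaring the dependency in the signature look affordable.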
Perhaps this could be initially prototyped using your Living module, and eventually backported into Ctypes itself if we find a way to ensure lifetime safety.
This is definitely an option, but it requires you to keep the return value alive, and it would only work for functions that return a pointer. For example, something returning unit would not work with this approach, like a void store_string(const char* my_string) function on the C side that simply stores my_string in some static and opaque structure.
I like the idea of using Living for prototyping though.
I thought that's what Ctypes fat pointers are for? (But I haven't used them so far so just guessing.)
In that case I'd usually copy the string, unless I want to very precisely match the lifetime of C data structures with OCaml values (e.g. by using finalizers and Custom tags).
You're right that the lifetime annotations can become more complicated than just "the return value uses this". There should probably also be a stores annotation, in which case a copy should probably be made, unless the lifetime of the OCaml value is known to exceed it (e.g. perhaps referenced from a global).
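One way to picture the "referenced from a global" case is a registry that pins a value until the caller explicitly releases it. This is a sketch with hypothetical names (`pin`, `unpin`), not an existing Ctypes or Living feature, and it assumes the caller knows when the C side is done with the data:

```ocaml
(* Hypothetical registry that keeps values reachable from a global root. *)
let pinned : (int, Obj.t) Hashtbl.t = Hashtbl.create 16
let next_key = ref 0

(* Pin [v] and return a key that releases it later. *)
let pin v =
  let key = !next_key in
  incr next_key;
  Hashtbl.replace pinned key (Obj.repr v);
  key

(* Release a pinned value; does nothing if the key is unknown. *)
let unpin key = Hashtbl.remove pinned key

(* Pin a buffer for as long as the C side is assumed to hold it. *)
let k = pin (Bytes.of_string "stored by C")
let pinned_while_stored = Hashtbl.mem pinned k
let () = unpin k
```

Pinning only keeps the value reachable. It does not stop the GC from moving values on the OCaml heap, so it suits owners of off-heap memory rather than plain OCaml strings.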
The more I look at this, the more it looks like we'd end up with something similar to Rust's lifetime annotations.
But perhaps we could use modality (Jane Street Tech Blog - Oxidizing OCaml: Locality) instead of lifetime annotations.
If we'd wrap a C argument with (global argtype) then the lifetime might be global (or at least exceed that of the current caller), and Ctypes should copy all arguments that aren't global.
On the other hand, if we wrap it with
let scoped_arg = scope argtype in
let scoped_arg2 = scope ~ref:scoped_arg argtype2 in
let scoped_ret = scope ~ref:scoped_arg2 in
foreign "func1" (scoped_arg @-> scoped_arg2 @-> float @-> returning scoped_ret)
then argument2 needs to be alive as long as arg1 is alive, and arg2 needs to be alive as long as the return value is. This probably needs list/tree building similar to the Living module.
This is still very fragile, though: it relies on the user to declare the lifetimes (e.g. perhaps from a C manual page), and there are no safeguards if you get it wrong.
OTOH this could be an opt-in "high-performance" interface, and a safer, default interface could be to always copy the arguments (where that is possible, obviously you wouldn't want to copy a possibly gigabyte large array…)
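The copy-by-default idea can be sketched in a few lines. The `lifetime` type and `prepare` function are hypothetical, and the sketch copies on the OCaml heap purely for illustration; a real binding would presumably copy into memory that the C side can safely hold:

```ocaml
(* Hypothetical lifetime marker for a buffer argument. *)
type lifetime =
  | Global  (* caller promises the buffer outlives the C side's use of it *)
  | Scoped  (* no such promise: hand over a private copy instead *)

(* Decide what to pass to the foreign call. *)
let prepare lifetime (buf : bytes) =
  match lifetime with
  | Global -> buf
  | Scoped -> Bytes.copy buf

let original = Bytes.of_string "hello"
let shared = prepare Global original  (* same buffer, no copy *)
let copied = prepare Scoped original  (* fresh buffer, same contents *)
```

Making `Scoped` the default would give the safe behaviour unless the user opts in to `Global`, at the cost of one copy per call.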
I've written a blog post on how living currently handles this situation. Please take a look. [BLOG] A Tour of the Living Library -- A Safer FFI. I think it's the best you can do without linear/affine types.