Finding GC problems in C bindings

lindig · March 7, 2019, 2:19pm

I have a binding for a C library function that I suspect is incorrect: in rare cases it returns a wrong value, which makes me suspect that the problem is GC related. Is there a good general strategy for finding or triggering this kind of problem?
For example, during development it could be helpful to check the integrity of the heap after every assignment in the C code, or to force garbage collections.

Problems I am suspecting in the code below:

Field(v, 0) = ... should not be used; caml_modify(&Field(v, 0), ...) should be used instead
The actual library call should be bracketed by caml_release_runtime_system() and caml_acquire_runtime_system()

It would be nice to actually demonstrate that this is wrong and trigger a problem.

CAMLprim        value
stub_statvfs(value filename)
{
    CAMLparam1(filename);
    CAMLlocal2(v, tmp);
    int             ret;
    int             i;
    struct statvfs  buf;

    ret = statvfs(String_val(filename), &buf);

    if (ret == -1)
        uerror("statvfs", Nothing);

    tmp = caml_copy_int64(0);

    /*
     * Allocate the thing to return and ensure each of the fields is set
     * to something valid before attempting any further allocations 
     */
    v = alloc_small(11, 0);
    for (i = 0; i < 11; i++) {
        Field(v, i) = tmp;
    }

    Field(v, 0) = caml_copy_int64(buf.f_bsize);
    Field(v, 1) = caml_copy_int64(buf.f_frsize);
    Field(v, 2) = caml_copy_int64(buf.f_blocks);
    Field(v, 3) = caml_copy_int64(buf.f_bfree);
    Field(v, 4) = caml_copy_int64(buf.f_bavail);
    Field(v, 5) = caml_copy_int64(buf.f_files);
    Field(v, 6) = caml_copy_int64(buf.f_ffree);
    Field(v, 7) = caml_copy_int64(buf.f_favail);
    Field(v, 8) = caml_copy_int64(buf.f_fsid);
    Field(v, 9) = caml_copy_int64(buf.f_flag);
    Field(v, 10) = caml_copy_int64(buf.f_namemax);

    CAMLreturn(v);
}

dbuenzli · March 7, 2019, 2:44pm

I have had quite some success in the past by simply calling Gc.full_major () after every binding call; usually a segfault occurs immediately after the offending one.

That does indeed look suspicious. The idiomatic way of doing this (cf. rule 3) would be:

v = caml_alloc (11, 0); 
Store_field (v, 0, caml_copy_int64(buf.f_bsize));
...
CAMLreturn (v);

dbuenzli · March 7, 2019, 2:53pm

More precisely I would say that rule 6 is being violated here.

lindig · March 7, 2019, 3:19pm

Since each field already holds a valid value, does this not force the use of caml_modify, as mandated by rule 6?

dbuenzli · March 7, 2019, 3:26pm

I guess so, and anyway this:

"Field(v, n) = w; is safe only if v is a block newly allocated by caml_alloc_small; that is, if no allocation took place between the allocation of v and the assignment to the field. In all other cases, never assign directly."

is strongly violated here, since each of the caml_copy_int64 calls allocates.
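
The fix discussed above can be sketched as a complete stub. This is a minimal sketch, not the original author's final code: it assumes the same eleven-field layout as the stub in the first post, keeps the uerror("statvfs", Nothing) call from it, and relies on two properties of the runtime: caml_alloc initialises every field of the block, and Store_field evaluates its value argument before it computes the address of the field.

```c
#include <sys/statvfs.h>

#include <caml/mlvalues.h>
#include <caml/memory.h>
#include <caml/alloc.h>
#include <caml/unixsupport.h>

CAMLprim value stub_statvfs(value filename)
{
    CAMLparam1(filename);
    CAMLlocal1(v);
    struct statvfs buf;

    if (statvfs(String_val(filename), &buf) == -1)
        uerror("statvfs", Nothing);

    /* caml_alloc (unlike caml_alloc_small) initialises all fields, so v
     * is a valid block even if a later allocation triggers a GC. */
    v = caml_alloc(11, 0);

    /* Store_field first evaluates caml_copy_int64(...), which may move v,
     * and only then takes &Field(v, i) and calls caml_modify. Because v
     * is registered with CAMLlocal1, it points to the moved block. */
    Store_field(v, 0, caml_copy_int64(buf.f_bsize));
    Store_field(v, 1, caml_copy_int64(buf.f_frsize));
    Store_field(v, 2, caml_copy_int64(buf.f_blocks));
    Store_field(v, 3, caml_copy_int64(buf.f_bfree));
    Store_field(v, 4, caml_copy_int64(buf.f_bavail));
    Store_field(v, 5, caml_copy_int64(buf.f_files));
    Store_field(v, 6, caml_copy_int64(buf.f_ffree));
    Store_field(v, 7, caml_copy_int64(buf.f_favail));
    Store_field(v, 8, caml_copy_int64(buf.f_fsid));
    Store_field(v, 9, caml_copy_int64(buf.f_flag));
    Store_field(v, 10, caml_copy_int64(buf.f_namemax));

    CAMLreturn(v);
}
```

In the original stub, by contrast, the C compiler is free to compute the address &Field(v, i) before calling caml_copy_int64, so the store can land in the old location of v after a minor collection has moved the block.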

dbuenzli · March 7, 2019, 3:27pm

You may also find it instructive to see how Unix.stat is implemented in OCaml itself:

github.com

ocaml/ocaml/blob/9dda8fae43b05fbbde5c1435885e8a9eaec54eeb/otherlibs/unix/stat.c#L50


#endif


#ifndef EOVERFLOW
#define EOVERFLOW ERANGE
#endif


static int file_kind_table[] = {
S_IFREG, S_IFDIR, S_IFCHR, S_IFBLK, S_IFLNK, S_IFIFO, S_IFSOCK
};


static value stat_aux(int use_64, struct stat *buf)
{
CAMLparam0();
CAMLlocal5(atime, mtime, ctime, offset, v);


#include "nanosecond_stat.h"
atime = caml_copy_double((double) buf->st_atime
                         + (NSEC(buf, a) / 1000000000.0));
mtime = caml_copy_double((double) buf->st_mtime
                         + (NSEC(buf, m) / 1000000000.0));
ctime = caml_copy_double((double) buf->st_ctime

lindig · March 7, 2019, 3:32pm

Indeed, this provides a template. It also uses caml_enter_blocking_section(), which is missing from the code I have. When is it safe to omit it?

dbuenzli · March 7, 2019, 3:49pm

Omitting it is always safe. It’s better to add it if the bracketed C code may run for a long time and/or may block, and is guaranteed not to interact with the OCaml runtime system.

Read the details in the manual.
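
As a rough illustration of that advice, the library call could be bracketed by a blocking section as follows. This is a sketch under assumptions: statvfs_released is a hypothetical helper name, and caml_stat_strdup requires a runtime recent enough to provide it. The file name is copied out of the OCaml heap first, because once the runtime lock is released the GC may move the OCaml string.

```c
#include <sys/statvfs.h>

#include <caml/mlvalues.h>
#include <caml/memory.h>
#include <caml/signals.h>

/* Hypothetical helper: calls statvfs(2) with the runtime lock released.
 * Returns the result of statvfs; on failure errno describes the error. */
static int statvfs_released(value filename, struct statvfs *buf)
{
    /* Copy the path into malloc'ed memory that the GC does not manage. */
    char *path = caml_stat_strdup(String_val(filename));
    int ret;

    /* No OCaml value may be read or written, and no runtime function
     * may be called, between these two calls. */
    caml_enter_blocking_section();
    ret = statvfs(path, buf);
    caml_leave_blocking_section();

    caml_stat_free(path);
    return ret;
}
```

Inside the stub, the direct call would then become if (statvfs_released(filename, &buf) == -1) uerror("statvfs", Nothing);, with the rest of the function unchanged. The pattern of copying the path and freeing it before the error check mirrors what the Unix library's stat.c does.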

cvine · March 11, 2019, 2:55pm

If you mean that this forces the use of caml_modify in place of Store_field, then I think not: Store_field itself calls caml_modify:

#define Store_field(block, offset, val) do{ \
  mlsize_t caml__temp_offset = (offset); \
  value caml__temp_val = (val); \
  caml_modify (&Field ((block), caml__temp_offset), caml__temp_val); \
}while(0)
