OCaml heap "fsck" and forcing collection of unreachable objects

(For background: this is about a bug we found when porting libguestfs to OCaml 5. You can read all about it here: The Libguestfs May 2023 Archive by thread. The shorter summary is below.)

When writing OCaml C bindings we have in the past found it useful to call Gc.compact () in our test programs. Although that was not its intended purpose, it turned out to be a useful kind of “fsck” over the OCaml heap, and over the years it has uncovered many bugs in our C bindings.
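As a rough illustration of that testing pattern, here is a minimal sketch. The names with_heap_check and fake_binding_call are hypothetical and not taken from libguestfs or libnbd; the stand-in body is pure OCaml so the example runs on its own, whereas a real test would call into the C binding there.

```ocaml
(* Hypothetical helper: run one test body, then call Gc.compact ().
   On OCaml 4 compaction walks the heap, so a binding that left a bad
   pointer behind tends to crash here rather than at some unrelated
   later point. *)
let with_heap_check (name : string) (body : unit -> 'a) : 'a =
  let result = body () in
  Gc.compact ();
  Printf.printf "%s: survived Gc.compact\n%!" name;
  result

(* Stand-in for a call into a C binding (assumption: a real test would
   exercise the binding here). *)
let fake_binding_call () = String.concat "," (List.init 5 string_of_int)

let () =
  let s = with_heap_check "fake_binding_call" fake_binding_call in
  assert (s = "0,1,2,3,4")
```

Wrapping each test body this way keeps the heap walk close to the call that may have corrupted the heap, which makes a crash easier to attribute.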

Also, again when testing, we have wanted a way to force collection of all unreachable objects. Gc.compact happens to do this as well, in OCaml 4. See for example: ocaml/tests/test_140_explicit_close.ml · master · nbdkit / libnbd · GitLab

However in OCaml 5 a couple of things happened. Firstly compaction isn’t really implemented at all (Update Gc module to reflect multicore changes · Issue #11812 · ocaml/ocaml · GitHub).

Secondly Gc.compact doesn’t even free all the objects. It only finishes the current major cycle. Gc.full_major was modified to do three major cycles which is, according to a comment, enough to free objects. The documentation still implies that Gc.compact is a superset of Gc.full_major, but it isn’t. See also: [Libguestfs] [PATCH libguestfs 1/2] ocaml/implicit_close test: collect all currently unreachable blocks
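To make the “force collection” use case concrete, here is a minimal sketch of a test that checks a finaliser runs once a value is unreachable. It uses Gc.full_major rather than Gc.compact, on the assumption (from the discussion above) that this is the call that collects unreachable blocks on both OCaml 4 and OCaml 5. The names finalised, allocate_and_drop and collect_unreachable are made up for the example.

```ocaml
(* Counts how many finalisers have run. *)
let finalised = ref 0

(* Allocate a heap block, attach a finaliser, and drop the only
   reference.  [@inline never] keeps the block out of the caller's
   frame, so nothing keeps it alive after this function returns. *)
let[@inline never] allocate_and_drop () =
  let handle = Bytes.create 64 in
  Gc.finalise (fun _ -> incr finalised) handle

(* Assumption: Gc.full_major is sufficient to collect unreachable
   blocks on both OCaml 4 and OCaml 5. *)
let collect_unreachable () = Gc.full_major ()

let () =
  allocate_and_drop ();
  collect_unreachable ();
  Printf.printf "finalisers run: %d\n%!" !finalised
```

A real binding test would replace the Bytes.create call with creating a handle through the C binding, and the finaliser with whatever closes it.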

So my question is: are there, or should there be, new calls which actually let us do the things we really want to do, i.e. “fsck”, and forcing collection of all unreachable objects?

This is partially known (which is to say I didn’t think it looked right, and your experience is suggesting it definitely isn’t!), cf. the comment in ocaml/ocaml#11816. Note that compaction is likely to be restored in 5.2 (cf. ocaml/ocaml#12193).

But, nota bene: only for blocks of size < 128 words. Blocks of size >= 128 will not be moved.

Cheers,
Nicolas

Will larger blocks be freed directly without compaction? I’m more interested in OCaml eventually returning memory to the OS (OCaml applications aren’t the only ones running on the system, or there could be more than one OCaml application running!) than compaction per se.
And for finding certain kinds of bugs, returning memory to the system might suffice (if you have an out-of-date pointer somewhere, it’ll point to unmapped memory and dereferencing it will crash).

For OCaml 4 try OCAMLRUNPARAM=c: it will do a forced garbage collection on exit. This is useful for also testing the finalizers of other global objects, or for very simple tests/examples that only use values that are always reachable while the test is running. I found some bugs in C bindings using that.