While working on reducing the memory pressure from unreferenced bigarrays, I wanted to avoid doing a complete major collection when the pressure gets too high by using Gc.major_slice instead. The following example seems to expose a bug in Gc.major_slice: it makes the GC stop completely, and the program raises Out_of_memory after a short while.
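    let rec alloc = function
      | 100000 ->
        (* Report the live heap size every 100000 iterations, then start over
           so that the loop never terminates. *)
        Printf.printf "Allocated words: %d\n%!" (Gc.stat ()).Gc.live_words;
        alloc 0
      | n ->
        (* 2048 bytes is large enough to be allocated on the major heap. *)
        Bytes.create 2048 |> ignore;
        Gc.major_slice 0 |> ignore;
        alloc (n + 1)

    let () = alloc 0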
(Tested on both 4.06.1 and 4.07-trunk).
If the line `Gc.major_slice 0 |> ignore;` is removed, the program runs forever (as expected). This seems to be related to the fact that bytes of 2K or more are allocated directly on the major heap, so in this example no allocations are ever made on the minor heap. Doing an allocation on the minor heap in the tight loop also makes the problem go away.
I assume this is not expected behavior, but I was hoping that someone with knowledge of the GC internals could explain what's going on, and when to use major_slice and when not to.
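The size threshold described above can be checked with a small experiment. This is only a sketch: the helper names `whsize_of_bytes` and `minor_delta` are made up for illustration, and the size formula (payload padded by at least one byte, rounded up to whole words, plus one header word) is an assumption about the runtime's block layout rather than something stated in this thread. If large byte sequences really bypass the minor heap, the second measurement should be close to zero while the first grows with the number of allocations.

```ocaml
(* Words occupied by a [Bytes.t] of [len] bytes, header included.
   Assumption: the payload is padded by at least one byte, rounded up to
   whole words, and preceded by one header word. *)
let whsize_of_bytes len =
  let word = Sys.word_size / 8 in
  (len + word) / word + 1

(* Minor-heap words consumed by [count] allocations of [len] bytes each. *)
let minor_delta ~len ~count =
  let before = Gc.minor_words () in
  for _i = 1 to count do
    (* [Sys.opaque_identity] keeps the allocation from being optimised away. *)
    ignore (Sys.opaque_identity (Bytes.create len))
  done;
  Gc.minor_words () -. before

let () =
  let small = minor_delta ~len:1024 ~count:1000 in
  let large = minor_delta ~len:2048 ~count:1000 in
  Printf.printf "2048-byte block: %d words\n" (whsize_of_bytes 2048);
  Printf.printf "minor words, 1024-byte blocks: %.0f\n" small;
  Printf.printf "minor words, 2048-byte blocks: %.0f\n" large
```

Note that this loop does not call Gc.major_slice, so it should not run into the problem reported here.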
That's a race condition triggered by the specifics of your code. That is not to say this isn't a bug; it should be filed in Mantis.
What is specific about your code is that it is a loop with no allocations on the minor heap. Under normal circumstances, the allocation function detects that the allocation was made on the major heap and triggers the GC cycle in the proper way (i.e., via caml_gc_dispatch, which empties the minor heap and fires a major slice). However, in order not to trigger a GC cycle after every allocation on the major heap, this is only done once a certain number of words have been allocated but not yet taken into account by the GC. This value is stored in the caml_allocated_words variable, which tracks the number of words allocated since the last cycle.
So far so good, but when Gc.major_slice is invoked directly, it can't start the major cycle, because the minor heap is not empty (when it is called from caml_gc_dispatch and a major cycle is requested, the dispatch function makes sure that the minor heap is emptied first). Thus major_slice does nothing, except for resetting caml_allocated_words to zero. As a result, we have a race condition: the counter is reset to 0 right after each allocation, so it is constantly equal to 258 on each iteration (2048/8 + 2 words on a 64-bit system) and never grows any further.
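The interaction between the counter and the slice can be illustrated with a toy model. This is not the runtime's code: the record, the function `simulate`, and the threshold value of 262144 words are all hypothetical and chosen only for illustration. The only figure taken from the discussion is the 258 words added per allocation.

```ocaml
(* Hypothetical threshold, in words, above which a cycle is requested. *)
let threshold = 262_144

type state = {
  mutable allocated_words : int;   (* models caml_allocated_words *)
  mutable cycles_requested : int;  (* how often a GC cycle was requested *)
}

(* Simulate [iterations] major-heap allocations of 258 words each.
   When [with_slice] is true, every allocation is followed by a slice that
   does no work but still resets the counter, as described above.
   Returns the peak counter value and the number of cycles requested. *)
let simulate ~with_slice iterations =
  let st = { allocated_words = 0; cycles_requested = 0 } in
  let peak = ref 0 in
  for _i = 1 to iterations do
    st.allocated_words <- st.allocated_words + 258;
    peak := max !peak st.allocated_words;
    if st.allocated_words > threshold then begin
      st.cycles_requested <- st.cycles_requested + 1;
      st.allocated_words <- 0
    end;
    if with_slice then st.allocated_words <- 0
  done;
  (!peak, st.cycles_requested)

let () =
  let peak, cycles = simulate ~with_slice:true 100_000 in
  Printf.printf "with slice:    peak = %d, cycles requested = %d\n" peak cycles;
  let peak, cycles = simulate ~with_slice:false 100_000 in
  Printf.printf "without slice: peak = %d, cycles requested = %d\n" peak cycles
```

In the model, the run with the slice never gets the counter above 258, so no cycle is ever requested, whereas the run without the slice crosses the threshold periodically.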
The fix would be either to keep the counter when the major slice was not able to perform any work at all, or to have major_slice empty the minor heap itself.
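Until the runtime is fixed, the second option can be approximated from user code by emptying the minor heap explicitly before asking for a slice. The following is a minimal sketch under the assumption that the diagnosis above is correct; the function name `alloc_with_workaround` is made up, the loop is bounded so that it terminates, and it has not been verified against the affected compiler versions.

```ocaml
(* Bounded variant of the loop from the original post.
   Assumption: calling [Gc.minor] right before [Gc.major_slice] leaves the
   minor heap empty, so the slice is able to start a major cycle. *)
let alloc_with_workaround iterations =
  let rec loop n =
    if n = iterations then n
    else begin
      ignore (Sys.opaque_identity (Bytes.create 2048));
      Gc.minor ();                (* empty the minor heap first *)
      ignore (Gc.major_slice 0);  (* then request a slice of major GC work *)
      loop (n + 1)
    end
  in
  loop 0

let () =
  let n = alloc_with_workaround 10_000 in
  Printf.printf "iterations: %d, live words: %d\n%!"
    n (Gc.stat ()).Gc.live_words
```

Forcing a minor collection on every iteration has a cost of its own, so in real code it would probably make sense to do this only every so many allocations.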
For reference, here are the details of the bug, with links to the code:
A place where the GC cycle should be requested (the condition is always false, because caml_allocated_words is always 258, which is much less than the threshold).
The place in the caml_major_collection_slice function (which is the C implementation of the Gc.major_slice function) where the counter is reset. It is always reachable and is a post-condition of this function.
No actual work is done by the slice (i.e., the start_cycle function is never called), because the minor heap is not empty.
Thank you for the detailed description of the problem. I will see if I can find the time to create a patch and test your suggested solution.
In the meantime, I have filed a report in Mantis: https://caml.inria.fr/mantis/view.php?id=7813