5.x GC compaction pause duration mental model

We are considering adding calls to Gc.compact () at times of lower activity in our program. Is it still true that in 5.x compaction stops the world? If so, what determines the duration of the pause: the number of reachable values, the number of unreachable ones, or both? Does any of you have some examples where you profiled how long compaction took and how it impacted the running program?

We haven’t done any profiling yet because it’s a bit hard for us to reproduce the usage patterns from production. I am considering yoloing it for some part of the traffic and measuring on production but wanted to ask first.

My understanding is that Gc compaction stops the world and then traverses the whole major heap (so the time should be proportional to the memory used by the program, minus off-heap memory). (This is for single-domain programs: if there are several domains they each process in parallel the fraction of the shared heap they own.)

Does any of you have some examples where you profiled how long compaction took

I don’t. You could try to measure compaction time on a smaller instance / test workload, and then scale that by the memory-usage ratio between your test instance and your production instances.
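To make that suggestion concrete, here is a minimal sketch of timing one compaction on a test workload and scaling the result linearly. The names time_compact and estimate_pause, the toy workload and the 50x ratio are all hypothetical, and the linear scaling is only the assumption described above, not a measured property. The clock defaults to Sys.time (CPU seconds) so the snippet has no dependencies; for a real pause measurement you would probably pass a wall-clock function such as Unix.gettimeofday.

```ocaml
(* Time one Gc.compact (). [clock] defaults to Sys.time (CPU seconds);
   pass a wall-clock function such as Unix.gettimeofday to measure the
   actual pause. *)
let time_compact ?(clock = Sys.time) () =
  let t0 = clock () in
  Gc.compact ();
  clock () -. t0

(* Linear extrapolation, ASSUMING pause time is proportional to heap size. *)
let estimate_pause ~test_secs ~test_heap_words ~prod_heap_words =
  test_secs *. float_of_int prod_heap_words /. float_of_int test_heap_words

let () =
  (* Hypothetical test workload: about a million small live blocks. *)
  let live = Array.init 1_000_000 (fun i -> Some i) in
  let secs = time_compact () in
  let heap = (Gc.quick_stat ()).Gc.heap_words in
  (* Hypothetical production heap: 50x the test heap. *)
  let est =
    estimate_pause ~test_secs:secs ~test_heap_words:heap
      ~prod_heap_words:(50 * heap)
  in
  Printf.printf "test: %.4f s on %d words; estimated production pause: %.4f s\n"
    secs heap est;
  (* Keep the workload alive until after the measurement. *)
  ignore (Sys.opaque_identity live)
```

The estimate is only as good as the similarity between the test heap and the production heap (object sizes, pointer density, number of domains), so treat it as an order-of-magnitude figure.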

how it impacted the running program?

We don’t have a good picture of how much fragmentation there is with the OCaml 5 GC; probably not that much.

I would expect notable benefits from compaction if there is a state/phase change in your program, from a state that uses a lot of memory to a new steady state that uses much less memory – then compaction will be able to improve locality and release memory to the OS.
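A small sketch of that phase-change scenario, in case it helps to experiment: build a large structure, drop it, compact once, and compare the heap size the runtime reports before and after. The helper names heap_words and compact_and_report and the toy Hashtbl workload are hypothetical, and how much heap_words actually shrinks will depend on the OCaml version and the allocation pattern.

```ocaml
(* Heap size in words, as reported by the runtime. *)
let heap_words () = (Gc.quick_stat ()).Gc.heap_words

(* Compact once and return the reported heap size before and after. *)
let compact_and_report () =
  let before = heap_words () in
  Gc.compact ();
  (before, heap_words ())

let () =
  (* Phase 1 (hypothetical): a memory-hungry phase. *)
  let table = Hashtbl.create 16 in
  for i = 0 to 999_999 do
    Hashtbl.replace table i (String.make 32 'x')
  done;
  (* Phase change: keep only a small summary and drop the rest. *)
  let entries = Hashtbl.length table in
  Hashtbl.reset table;
  let before, after = compact_and_report () in
  Printf.printf "processed %d entries; heap words: %d before, %d after compaction\n"
    entries before after
```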

If you are in the middle of a steady state that uses a stable-ish amount of memory, even if there are cycles with much lower memory usage at the beginning and end of each iteration, then there may not be that much to be gained. But I don’t know for sure (and I’m no GC expert), and it’s worth giving it a try!


I would expect notable benefits from compaction if there is a state/phase change in your program, from a state that uses a lot of memory to a new steady state that uses much less memory – then compaction will be able to improve locality and release memory to the OS.

I wanted to mention that I expected something like this. I transitioned an older project of mine to OCaml 5 (5.3 to be precise), and I noticed that the memory usage would steadily climb over time (which I never noticed in OCaml 4), which looked like a memory leak to me, so I added a Gc.compact () at the end of the phase that usually allocates a lot. This seemed to work, but the new problem was that over time, the compactions would take longer and longer. I then found this blog post and used their suggestion, and that brought the memory usage under control, without any compaction.


We (upstream) are aware of “GC pacing issues” with 5.x. A symptom that someone’s workflow may be affected by those pacing issues is that they need an unnaturally low space-overhead setting (the o setting of OCAMLRUNPARAM; “unnaturally low” means noticeably below 80%) to avoid a large increase in memory consumption. There is ongoing work to fix these issues by @stedolan, @NickBarnes and @damiendoligez; it will not be ready for the upcoming 5.4 release, but hopefully for 5.5. Once those issues are fixed, the tricks of the semgrep blog post will hopefully be unnecessary (… and in fact they may become counter-productive, as the GC would over-react and spend too much time collecting memory).
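For reference, the space-overhead setting can also be changed from inside the program through the Gc module, which is equivalent to setting o in OCAMLRUNPARAM at startup. A minimal sketch; the helper name set_space_overhead is hypothetical and the value 80 is only an illustration, not a recommendation:

```ocaml
(* Set the GC space_overhead parameter (a percentage) at runtime.
   Equivalent in effect to starting the program with OCAMLRUNPARAM=o=<pct>. *)
let set_space_overhead pct =
  Gc.set { (Gc.get ()) with Gc.space_overhead = pct }

let () =
  let before = (Gc.get ()).Gc.space_overhead in
  (* 80 is an illustrative value only. *)
  set_space_overhead 80;
  Printf.printf "space_overhead: %d -> %d\n"
    before (Gc.get ()).Gc.space_overhead
```

Lower values make the major GC work harder to keep memory down, at the cost of more CPU time spent collecting.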


You can get measurements cheaply using Runtime Events and the olly trace tool, which gives accurate timings for how long the compaction takes and works across multiple domains. The resulting Fuchsia-format trace files are readable in Perfetto or Chrome’s tracing tool.
