The proposals are not naive.
We tried domain-local allocation buffers (DLABs) to make better use of the minor heap area. In our experiments, the memory hierarchy effects of domains allocating into regions mmap’ed by other domains hurt performance significantly, and that cost outweighed the improvement gained by minimising false promotions. This effort did not make it into Multicore OCaml. We wrote up a retrospective in the “Domain Local Allocation Buffers Addendum” page of the ocaml-multicore/ocaml-multicore wiki on GitHub.
Given the experience with DLABs, my suggestion for any GC experiments is not to start with ideas for a new algorithm. Instead:
- Build a suite of representative benchmarks that you care about. The Sandmark parallel benchmarks are just a tiny sliver of a very wide space, and we don’t yet know how parallel programs in OCaml will be written. At the end of the day, the benchmark suite is what you will be measuring performance improvements against.
- Collect metrics. If you think false promotions are a problem, let’s collect experimental evidence for this. At this point, I don’t know whether false promotions are a problem in parallel OCaml programs because I haven’t measured them. So I’m reluctant to comment on the details of the algorithms proposed to reduce false promotions.
We can reuse some of these efforts across the different experiments that we want to do. We’re open to accepting contributions to Sandmark for example.
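As a rough starting point for the metrics mentioned above, here is a minimal sketch that reads the standard library's `Gc.counters` around a workload and reports how many minor-heap words were promoted. The names `measure_promotion` and `workload` are made up for illustration. Note the assumption: this measures all promotions, not false promotions specifically, so it is only a first-order proxy. On OCaml 5 the counters are, as far as I understand, reported for the calling domain, so a parallel benchmark would need to collect them in each domain.

```ocaml
(* Sketch only: promotion volume is a proxy, it does not distinguish
   "false" promotions from objects that genuinely needed to survive. *)

(* Run [f ()] and return (result, minor words allocated, words promoted).
   The minor heap is emptied before and after so that every word promoted
   in between was also allocated in between. *)
let measure_promotion f =
  Gc.minor ();
  let (minor0, promoted0, _) = Gc.counters () in
  let result = f () in
  Gc.minor ();
  let (minor1, promoted1, _) = Gc.counters () in
  (result, minor1 -. minor0, promoted1 -. promoted0)

(* Hypothetical workload: mostly short-lived tuples, with one list cell
   in ten iterations kept alive by the returned list. *)
let workload () =
  let keep = ref [] in
  for i = 1 to 100_000 do
    let tmp = (i, i + 1) in
    if i mod 10 = 0 then keep := fst tmp :: !keep
  done;
  !keep

let () =
  let (kept, minor, promoted) = measure_promotion workload in
  let ratio = if minor > 0. then promoted /. minor else 0. in
  Printf.printf "kept=%d minor_words=%.0f promoted_words=%.0f ratio=%.4f\n"
    (List.length kept) minor promoted ratio
```

Running something like this over each benchmark in the suite would at least show whether promotion volume is large enough to be worth a closer look before investing in a new algorithm.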
I should also say that any serious change in the GC algorithm will require significant software engineering effort if the plan is to upstream it. Memory corruption bugs in a parallel GC are a pain to debug.