Thanks so much for running these and it’s good to see it had an impact.
It took a little while to get these branches running in the benchmarking suite (and there’s still an abort in one of the parallel benchmarks that I need to investigate), but there are some preliminary sequential numbers here:
It seems the performance impact of not batching pool allocations is fairly small. The only difference between pool_release and pool_release_cycle is when pools are released. The former does so immediately, the latter only at the end of a major cycle.
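To make the distinction concrete, here is a minimal C sketch of the two release policies. It is an illustration only, not the code in either branch: the names (pool_heap, pool_done, end_of_major_cycle), the 4096-byte pool size, and the use of malloc as backing store are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_BYTES 4096 /* hypothetical pool size, chosen for the example */

typedef enum { RELEASE_IMMEDIATE, RELEASE_AT_CYCLE_END } release_policy;

typedef struct pool {
    struct pool *next; /* link used only while queued for release */
    void *mem;         /* backing memory for the pool */
} pool;

typedef struct {
    release_policy policy;
    pool *pending; /* empty pools waiting for the end of the cycle */
    size_t live;   /* pools that currently hold backing memory */
} pool_heap;

/* Allocate a new pool; returns NULL on allocation failure. */
static pool *pool_acquire(pool_heap *h) {
    pool *p = malloc(sizeof *p);
    if (p == NULL) return NULL;
    p->mem = malloc(POOL_BYTES);
    if (p->mem == NULL) { free(p); return NULL; }
    p->next = NULL;
    h->live++;
    return p;
}

/* Give a pool's memory back to the allocator. */
static void pool_free_now(pool_heap *h, pool *p) {
    assert(h->live > 0);
    free(p->mem);
    free(p);
    h->live--;
}

/* Called when a pool has become empty and is no longer needed. */
static void pool_done(pool_heap *h, pool *p) {
    if (h->policy == RELEASE_IMMEDIATE) {
        pool_free_now(h, p);   /* release straight away */
    } else {
        p->next = h->pending;  /* defer: queue until the cycle ends */
        h->pending = p;
    }
}

/* Called at the end of a major cycle; returns how many pools it released. */
static size_t end_of_major_cycle(pool_heap *h) {
    size_t released = 0;
    while (h->pending != NULL) {
        pool *p = h->pending;
        h->pending = p->next;
        pool_free_now(h, p);
        released++;
    }
    return released;
}
```

Under the immediate policy the live count drops as soon as pool_done is called; under the deferred policy the memory stays held until end_of_major_cycle runs, so the two differ only in how long empty pools remain resident.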
I think there’s probably a good argument for releasing pools when done with them. I’m also pondering whether we need to mmap the pools or whether malloc might be sufficient.
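For the mmap versus malloc question, one way to compare them would be to hide the backing store behind a small interface so the two can be swapped in the benchmarks. The sketch below assumes a POSIX system with anonymous mappings; pool_backend and the four helper functions are hypothetical names, not anything in the current runtime.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict glibc modes */
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Hypothetical interface: how a pool obtains and returns its memory. */
typedef struct {
    void *(*alloc)(size_t bytes);
    void (*release)(void *mem, size_t bytes);
} pool_backend;

/* mmap-backed: page-aligned memory, handed back to the OS by munmap. */
static void *mmap_alloc(size_t bytes) {
    void *mem = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return mem == MAP_FAILED ? NULL : mem;
}

static void mmap_release(void *mem, size_t bytes) {
    munmap(mem, bytes);
}

/* malloc-backed: the C allocator decides whether freed memory is
   returned to the OS or kept for reuse. */
static void *malloc_alloc(size_t bytes) {
    return malloc(bytes);
}

static void malloc_release(void *mem, size_t bytes) {
    (void)bytes; /* free does not need the size */
    free(mem);
}

static const pool_backend mmap_backend = { mmap_alloc, mmap_release };
static const pool_backend malloc_backend = { malloc_alloc, malloc_release };
```

One difference worth keeping in mind: malloc makes no promise of page alignment, so if the pools rely on alignment (for example to find a pool header from an interior pointer), a malloc-based variant would need an aligned allocation call instead.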