I believe OCaml uses mmap to allocate memory but used to depend on malloc. At least OCaml 4.02 seems to have used malloc, but I could not find in the Changes file when this change was introduced, and I would be curious to know. I had asked about GC details at "A lesson from Ruby's GC?" and learned about OCaml now using mmap, which shields it from assumptions made by malloc.
It was implemented in 3.05 it seems.
And it looks like it was enabled by default:
This puzzles me because I see a huge impact from using jemalloc instead of glibc malloc in a program compiled with OCaml 4.02.
In fact, it is not true that OCaml uses mmap instead of malloc. The information from that other thread is incorrect. Currently it does so only if you pass the option H to OCAMLRUNPARAM.
It used to use mmap by default on Linux until 2008, when mmap support was removed a first time. mmap was reintroduced in 4.03 in the patch to improve the GC’s latency, with a different (undocumented) switch, and not activated by default.
That would explain the impact of using jemalloc.
Looking at the implementation of OCAMLRUNPARAM=H, it only uses hugepages (MAP_HUGETLB), which is problematic in some ways:
- hugepage allocations fail by default (at least on Ubuntu); root first has to reserve some hugepages
- once reserved, hugepages cannot be used for other purposes
- hugepage allocations are not available in PV guests (I’d say not very relevant, if it weren’t for Dom0), where you only get 4KiB pages, not hugepages
- memory fragmentation can cause hugepage allocations to fail even on systems with otherwise plenty of RAM (hence the reservation, which is best done at boot time)
Without reserved hugepages, OCAMLRUNPARAM=H ocaml fails like this:
mmap(NULL, 4194304, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0) = -1 ENOMEM (Cannot allocate memory)
write(2, "Fatal error: cannot allocate ini"..., 49Fatal error: cannot allocate initial major heap.
One first has to reserve some hugepages as root:
sudo sh -c 'echo 20 > /proc/sys/vm/nr_hugepages'
cat /proc/meminfo|grep -i huge
AnonHugePages: 2048 kB
ShmemHugePages: 0 kB
HugePages_Total: 20
HugePages_Free: 20
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 40960 kB
From that point on, any user can use them:
mmap(NULL, 4194304, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0) = 0x7fae43a00000
There are other ways of using hugepages (and they could be beneficial for memory-intensive applications), e.g. through madvise if you have transparent hugepages enabled (which has its own pros and cons; I haven’t measured). With madvise, transparent hugepages are just a hint though, and the kernel falls back to regular 4K pages when hugepages are not available, so that might be a way to improve hugepage support in OCaml.
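To make the fallback idea concrete, here is a minimal C sketch of what such an allocation path could look like. This is not what the OCaml runtime does; heap_chunk_alloc and round_up_huge are hypothetical names, and the 2 MiB huge page size is an assumption taken from the Hugepagesize line in the /proc/meminfo output above. It tries MAP_HUGETLB first and, if that fails, maps ordinary pages and passes MADV_HUGEPAGE as a hint.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* for MAP_ANONYMOUS, MAP_HUGETLB, madvise on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Assumption: 2 MiB huge pages, as reported by Hugepagesize in /proc/meminfo. */
#define HUGE_PAGE_SIZE (2UL * 1024 * 1024)

/* Round len up to a multiple of the assumed huge page size. */
static size_t round_up_huge(size_t len) {
    return (len + HUGE_PAGE_SIZE - 1) & ~(HUGE_PAGE_SIZE - 1);
}

/* Hypothetical helper: try explicit hugepages first, then fall back to
   ordinary pages with a transparent-hugepage hint.
   Returns NULL on failure; *used_hugetlb is 1 if the MAP_HUGETLB
   mapping succeeded and 0 otherwise. */
static void *heap_chunk_alloc(size_t len, int *used_hugetlb) {
    size_t size = round_up_huge(len);
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
    void *p = MAP_FAILED;

#ifdef MAP_HUGETLB
    /* Fails with ENOMEM when no hugepages have been reserved. */
    p = mmap(NULL, size, PROT_READ | PROT_WRITE, flags | MAP_HUGETLB, -1, 0);
#endif
    *used_hugetlb = (p != MAP_FAILED);

    if (p == MAP_FAILED) {
        /* Fallback: regular 4K pages. */
        p = mmap(NULL, size, PROT_READ | PROT_WRITE, flags, -1, 0);
        if (p == MAP_FAILED)
            return NULL;
#ifdef MADV_HUGEPAGE
        /* Only a hint: the kernel may ignore it, so errors are not fatal.
           Only 2 MiB-aligned parts of the range can be backed by huge pages. */
        (void) madvise(p, size, MADV_HUGEPAGE);
#endif
    }
    return p;
}
```

A caller would release the chunk with munmap(p, round_up_huge(len)). Whether the transparent-hugepage hint helps at all depends on the system's THP settings, which I have not measured.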
It would be interesting to measure performance and memory usage with regular malloc vs mmap though (perhaps rounded up to 1MiB or 2MiB).
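As a starting point for such a measurement, a rough micro-benchmark sketch in C follows. The helper names (now_seconds, time_malloc, time_mmap) are made up for illustration, and it only times allocate/touch/free cycles, not memory usage. One caveat to keep in mind: glibc's malloc itself switches to mmap for requests above its mmap threshold (128 KiB by default, adjusted dynamically), so for 1 MiB chunks the two paths may be closer than expected.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* for MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <time.h>

/* Monotonic clock reading in seconds. */
static double now_seconds(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double) ts.tv_sec + (double) ts.tv_nsec / 1e9;
}

/* Time `rounds` malloc/free cycles of `chunk` bytes. The first byte is
   touched so that at least one page is actually faulted in.
   Returns elapsed seconds, or -1.0 if an allocation fails. */
static double time_malloc(size_t chunk, int rounds) {
    double start = now_seconds();
    for (int i = 0; i < rounds; i++) {
        volatile char *p = malloc(chunk);
        if (p == NULL)
            return -1.0;
        p[0] = 1;
        free((void *) p);
    }
    return now_seconds() - start;
}

/* Same loop, but going to the kernel directly with mmap/munmap. */
static double time_mmap(size_t chunk, int rounds) {
    double start = now_seconds();
    for (int i = 0; i < rounds; i++) {
        void *m = mmap(NULL, chunk, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (m == MAP_FAILED)
            return -1.0;
        ((volatile char *) m)[0] = 1;
        munmap(m, chunk);
    }
    return now_seconds() - start;
}
```

Calling both with, say, a 1 MiB or 2 MiB chunk and a few thousand rounds gives a first impression; a real comparison would also need an allocation pattern closer to what the OCaml major heap does.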
mmap typically has a lot more overhead than malloc, and glibc’s implementation of malloc (used to) cache and “leak” mmapped pages in multithreaded programs (see https://bugzilla.redhat.com/show_bug.cgi?id=640286, http://codearcana.com/posts/2016/07/11/arena-leak-in-glibc.html). Other malloc implementations (tcmalloc, jemalloc, or perhaps even glibc with a tuned maximum number of arenas) are better at unmapping memory and returning it to the OS.
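For the glibc tuning mentioned above, here is a small sketch, assuming a glibc system (mallopt, M_ARENA_MAX and malloc_trim are glibc-specific and declared in malloc.h). The helper names limit_malloc_arenas and churn_and_trim are hypothetical. Without recompiling, the same arena limit can be set through the MALLOC_ARENA_MAX environment variable.

```c
#include <malloc.h>   /* glibc-specific: mallopt, M_ARENA_MAX, malloc_trim */
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: cap the number of malloc arenas glibc may create.
   mallopt returns 1 on success and 0 on error. */
static int limit_malloc_arenas(int max_arenas) {
    return mallopt(M_ARENA_MAX, max_arenas);
}

/* Allocate and free a batch of blocks, then ask glibc to hand free memory
   back to the OS. Returns the result of malloc_trim (1 if some memory was
   released, 0 if not), or -1 if the bookkeeping array cannot be allocated. */
static int churn_and_trim(size_t nblocks, size_t block_size) {
    void **blocks = calloc(nblocks, sizeof *blocks);
    if (blocks == NULL)
        return -1;

    for (size_t i = 0; i < nblocks; i++) {
        blocks[i] = malloc(block_size);
        if (blocks[i] != NULL)
            memset(blocks[i], 0, block_size);  /* touch the memory */
    }
    for (size_t i = 0; i < nblocks; i++)
        free(blocks[i]);                       /* free(NULL) is a no-op */
    free(blocks);

    return malloc_trim(0);
}
```

Calling limit_malloc_arenas(1) early in main, before any threads are started, is the usual place for this. Whether it is enough to match jemalloc in the OCaml 4.02 case above is something that would have to be measured.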