Strangely high memory consumption with OCaml multicore

To test OCaml multicore, I installed the 4.12.0+domains+effects compiler switch via opam and implemented quicksort using tasks.

To my surprise, even the purely sequential version has very high memory consumption when run from within a thread-pool (Domainslib) context: I see OCaml using 800 megabytes of RAM to sort an integer array of size 1 million when run within the following context:

let pool = Tsk.setup_pool ~num_additional_domains:3 () in
let run f = Tsk.run pool f in

The problem does not occur without this pool context.

Another surprising aspect: 256 gigabytes of virtual memory are allocated for running anything inside OCaml.

If I run the computation over 4 threads, it dies before the end (but somehow the process returns 0).

So I wonder about the current status of OCaml multicore, because what I’ve seen does not seem quite functional. Have I installed an incorrect version? Which one should I pick? (I’m using opam.)

What about trying trunk, i.e. what will become OCaml 5.0 in a few months’ time? 4.12+domains+effects is supposed to be quite stable, but it is probably not getting many fixes and updates now.

Are you using int array or Bigarray?

I wouldn’t worry too much about this aspect. Many runtimes allocate huge amounts of virtual memory for various reasons. My guess is that each thread reserves its own pre-defined region of virtual address space, so that threads avoid stomping on one another when sharing the same memory is unnecessary. There could be other reasons.

As a point of comparison, my Haskell language server shows that it is occupying 1.0 TB (!) of virtual memory.

int array, not Bigarray. An int array of length 1 million should take eight megabytes.
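As a quick sanity check of that estimate (not part of the original benchmark), one can measure the array's footprint from the stdlib with `Obj.reachable_words`, which counts heap words including block headers:

```ocaml
(* On a 64-bit system, an int array holds one word (8 bytes) per
   element plus a one-word block header, so 1 million elements is
   about 8 MB. [Obj.reachable_words] counts words including headers. *)
let () =
  let a = Array.init 1_000_000 Fun.id in
  let words = Obj.reachable_words (Obj.repr a) in
  let bytes = words * (Sys.word_size / 8) in
  Printf.printf "%d words, %d bytes (~%.1f MB)\n"
    words bytes (float_of_int bytes /. 1e6)
```

So the 800 MB figure reported above is about two orders of magnitude more than the data itself.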

Trying on trunk? I tried installing it through opam but failed. Would you have a working command line for this?

Try this:

$ mkdir ocaml-500
$ cd ocaml-500
$ git clone https://github.com/ocaml/ocaml.git
$ opam switch create . --empty
$ cd ocaml
$ opam install .
$ ocaml --version
The OCaml toplevel, version 5.0.0+dev0-2021-11-05

Elaborated from HACKING.adoc at trunk in the ocaml/ocaml repository on GitHub; there are other approaches described at that link as well.

The situation is better with the 5.0.0 trunk, but the process still dies early when 4 cores are used, and RAM consumption is just unbelievably high.

With a recent version of opam

opam switch create 5.0.0+trunk

works.
Earlier versions need to enable access to the beta versions of OCaml by adding the ocaml-beta repository:

opam switch create 5.0.0+trunk --repo=default,ocaml-beta=git+https://github.com/ocaml/ocaml-beta-repository.git

Running

let a = Array.init  1_000_000 Fun.id
let f () = Array.sort Stdlib.compare a
open Domainslib
let pool = Task.setup_pool ~num_additional_domains:3 ()
let () = Task.run pool f

gives me a peak memory use of 14 MB. It seems that the issue does not come from the code that you shared.
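For anyone wanting to reproduce this kind of measurement without external tooling, here is a stdlib-only sketch that samples the major heap size with `Gc.quick_stat` before and after a run. Note this tracks only the OCaml heap, not the process's resident set, so it is a rough proxy for the figures quoted in this thread:

```ocaml
(* Sample the major heap size (in words) around a run. This measures
   the OCaml heap only, not the OS-level memory of the process. *)
let heap_mb () =
  let s = Gc.quick_stat () in
  float_of_int (s.Gc.heap_words * (Sys.word_size / 8)) /. 1e6

let () =
  let before = heap_mb () in
  let a = Array.init 1_000_000 Fun.id in
  Array.sort compare a;
  Printf.printf "major heap: %.1f MB -> %.1f MB\n" before (heap_mb ());
  ignore (Sys.opaque_identity a)
```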


See the source code. When the array size reaches about 1 million, RAM consumption is about 1 gigabyte. This is just a vanilla handwritten sequential quicksort procedure called from the main domain in a pool with 4 domains.
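For readers following along, the poster's source is only linked, not reproduced here; a plain in-place quicksort of the kind being described might look like the following (a hypothetical reconstruction using Lomuto partitioning, not the thread's actual code):

```ocaml
(* Hypothetical reconstruction: a vanilla sequential in-place
   quicksort over an int array, with a last-element pivot. *)
let swap a i j =
  let t = a.(i) in
  a.(i) <- a.(j);
  a.(j) <- t

let rec quicksort a lo hi =
  if lo < hi then begin
    let pivot = a.(hi) in
    let i = ref lo in
    (* Lomuto partition: move elements below the pivot to the front *)
    for j = lo to hi - 1 do
      if a.(j) < pivot then begin
        swap a !i j;
        incr i
      end
    done;
    swap a !i hi;
    quicksort a lo (!i - 1);
    quicksort a (!i + 1) hi
  end

let () =
  let a = Array.init 100_000 (fun _ -> Random.int 1_000_000) in
  quicksort a 0 (Array.length a - 1);
  (* verify sortedness *)
  for i = 1 to Array.length a - 1 do
    assert (a.(i - 1) <= a.(i))
  done
```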

The same procedure works perfectly if the pool has only 1 domain, and memory consumption is just sixty megabytes.

Thanks for the code!

The GC behavior does seem interesting: with 4.14, memory use peaks around 100 MB; it goes down to 50 MB with 5.0 without domains, and goes up to 900 MB once a domain is created.

It looks like the GC is struggling to scan all the integer arrays allocated in the major heap fast enough. Typically, either switching to float arrays or adding a Gc.full_major () before running the test for another array size fixes the memory consumption issue.
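As a sketch of that workaround (the benchmark driver and sizes here are placeholders, not the thread's actual code), the idea is simply to force a full major collection between array sizes so garbage from the previous run is reclaimed before the next large allocation:

```ocaml
(* Placeholder benchmark loop: sort arrays of increasing sizes,
   forcing a full major collection between runs. *)
let bench sort sizes =
  List.iter
    (fun n ->
       let a = Array.init n (fun _ -> Random.int (max n 1)) in
       sort a;
       Gc.full_major ()   (* reclaim the previous run's garbage *)
    )
    sizes

let () = bench (fun a -> Array.sort compare a) [10_000; 100_000; 1_000_000]
```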

Sounds like a good argument for a ‘no scan’ bit in the header.

But you said ‘4.14’ which confuses me. Isn’t 4.14 supposed to not include multicore?

4.14 does not support multicore, but we are discussing the behavior of a sequential test of a sequential sort for increasing array lengths, which does not require parallelism to run.

Oh sorry I misread your post.

In any case, should parallel GC scanning be impacted so heavily? Surely there must be a benchmark comparing single-domain to multi-domain GC scanning?

Indeed, the problem disappears if the main loop calls the GC. In contrast, the huge memory growth still occurs when using Bigarrays, so it is not an issue of scanning a huge array of plain integers to check whether they are pointers.
