Hello, today’s little known OCaml feature we will learn about are memory limits using Gc alarms.
Imagine you want to perform a memory-heavy computation, whose size of data set is not known in advance. (This post is inspired by the implementation of the SAT/SMT solver mSAT.) You would like this computation to fail gracefully if memory becomes insufficient. How can you do it?
The Out_of_memory
exception
The first thing you might think about is the Out_of_memory
exception. You simply run the computation inside an exception handler, expecting to receive this exception if memory ever becomes exhausted. However, for several reasons this is not reliable. First, the program is not always notified of failing allocations and gets killed instead much after, when it tries to access the memory for the first time (this behaviour of the operating system is called overcommitting). But this is the least problem, because you quickly figure out how to get rid of overcommitting, for instance by setting a hard memory limit to your process using ulimit
.
However, you then encounter the second issue: in OCaml, if a minor allocation fails, then the program will issue a fatal error instead of raising Out_of_memory
. That’s it: Out_of_memory
is only reliable for allocations that are guaranteed to go directly on the major heap (by default those bigger than 256 words, or those performed for bigarrays or unmarshalling). Not for repeated small allocations on the minor heap, which would represent most allocations in a functional program. (There are bug reports about this, but making Out_of_memory
reliable is reported to be non-trivial.)
GC alarms
GC alarms are callbacks which run at the end of major GC cycles. They are registered with Gc.create_alarm
. Now observe this cool trick:
let run_with_memory_limit limit f =
let limit_memory () =
let mem = Gc.(quick_stat ()).heap_words in
if mem > limit / (Sys.word_size / 8) then raise Out_of_memory
in
let alarm = Gc.create_alarm limit_memory in
Fun.protect f ~finally:(fun () -> Gc.delete_alarm alarm ; Gc.compact ())
That’s it, we simply raise Out_of_memory
from the Gc alarm. You can try it in the toplevel on a memory-hungry function:
let f () =
let x = ref [] in
let rec f () = x := ()::!x ; f () in
f ();;
run_with_memory_limit 1000000000 f;;
Caveat
This is not reliable with multi-threaded programs, because we do not know in which thread the exception must be raised. This is fine for programs like mSAT and many logic/language tools, because such tools are rarely limited by blocking IO, so they do not lose much by remaining single-threaded in the absence of true parallelism. (Haskell proposes per-thread allocation limits as a built-in feature, which manifests itself likewise as an asynchronous exception.)
Implementation
Behind each Gc alarm, there is a finalised block. The finaliser calls the callback, and sets itself to run again at the next cycle. Thus, the memory limit (thus implemented) is an example of asynchronous exception usefully arising from a finaliser. Was such a use of Gc alarms expected when they were conceived? History did not tell.