Hello, today’s little known OCaml feature we will learn about are memory limits using Gc alarms.
Imagine you want to perform a memory-heavy computation, whose size of data set is not known in advance. (This post is inspired by the implementation of the SAT/SMT solver mSAT.) You would like this computation to fail gracefully if memory becomes insufficient. How can you do it?
The first thing you might think about is the
Out_of_memory exception. You simply run the computation inside an exception handler, expecting to receive this exception if memory ever becomes exhausted. However, for several reasons this is not reliable. First, the program is not always notified of failing allocations and gets killed instead much after, when it tries to access the memory for the first time (this behaviour of the operating system is called overcommitting). But this is the least problem, because you quickly figure out how to get rid of overcommitting, for instance by setting a hard memory limit to your process using
However, you then encounter the second issue: in OCaml, if a minor allocation fails, then the program will issue a fatal error instead of raising
Out_of_memory. That’s it:
Out_of_memory is only reliable for allocations that are guaranteed to go directly on the major heap (by default those bigger than 256 words, or those performed for bigarrays or unmarshalling). Not for repeated small allocations on the minor heap, which would represent most allocations in a functional program. (There are bug reports about this, but making
Out_of_memory reliable is reported to be non-trivial.)
GC alarms are callbacks which run at the end of major GC cycles. They are registered with
Gc.create_alarm. Now observe this cool trick:
let run_with_memory_limit limit f = let limit_memory () = let mem = Gc.(quick_stat ()).heap_words in if mem > limit / (Sys.word_size / 8) then raise Out_of_memory in let alarm = Gc.create_alarm limit_memory in Fun.protect f ~finally:(fun () -> Gc.delete_alarm alarm ; Gc.compact ())
That’s it, we simply raise
Out_of_memory from the Gc alarm. You can try it in the toplevel on a memory-hungry function:
let f () = let x = ref  in let rec f () = x := ()::!x ; f () in f ();; run_with_memory_limit 1000000000 f;;
This is not reliable with multi-threaded programs, because we do not know in which thread the exception must be raised. This is fine for programs like mSAT and many logic/language tools, because such tools are rarely limited by blocking IO, so they do not lose much by remaining single-threaded in the absence of true parallelism. (Haskell proposes per-thread allocation limits as a built-in feature, which manifests itself likewise as an asynchronous exception.)
Behind each Gc alarm, there is a finalised block. The finaliser calls the callback, and sets itself to run again at the next cycle. Thus, the memory limit (thus implemented) is an example of asynchronous exception usefully arising from a finaliser. Was such a use of Gc alarms expected when they were conceived? History did not tell.