Tracing OCaml fatal error

Hi,

I am working on a parser in OCaml.
While parsing some files, I get two kinds of fatal errors for “out of memory”, with different messages.

  1. Fatal error: out of memory
  2. Fatal error: exception Out_of_memory

Could you please tell me the difference between these two?

I can get a stack trace for the 2nd exception using OCAMLRUNPARAM=b.
However, no stack trace is printed in the case of the 1st error.

Could you please give me any suggestion on this?

Thank you in advance.

The difference is as follows: the first one immediately halts the program when malloc fails, whereas the second one reacts by raising an exception, which was left uncaught by your program. The first is the only option in the current GC design if the GC fails to obtain memory from the OS during minor collection. Thus the second one only happens for allocations that go directly to the major heap (e.g. larger than 256 words, or performed for bigarrays or unmarshalling).

Having an exception, even uncaught, is important if you have resources that need clean-up. In addition, catching Out_of_memory makes sense if you know that you risk making a large allocation (which can easily happen when parsing), i.e. if you know that you are not actually short on memory, but merely trying to allocate more than the OS can handle.
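To make this concrete, here is a minimal sketch of catching Out_of_memory around one oversized allocation. The `try_alloc` helper is an illustrative name; a request as large as `Sys.max_string_length` goes straight to the major heap, so on a 64-bit machine the OS will typically refuse it and the runtime raises the exception rather than aborting:

```ocaml
(* Attempt a single large allocation and recover if the OS refuses it.
   Because the request itself failed, the rest of the heap is intact
   and we can still clean up and report. *)
let try_alloc n =
  match Bytes.create n with
  | b -> Ok b
  | exception Out_of_memory ->
      Error (Printf.sprintf "cannot allocate %d bytes" n)

let () =
  match try_alloc Sys.max_string_length with
  | Ok _ -> print_endline "allocated"
  | Error msg -> print_endline msg
```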

In contrast, the immediate fatal error is very bad for fault-tolerant programming, as it gives no option to clean up lock files, etc. But it is hard to solve at the runtime level: even if the GC were reworked to be able to raise an exception when a small allocation fails, it would be hard to know what to do at that point, given that subsequent allocations are likely to fail too. Currently, the only solution is to limit memory use with an exception that is raised asynchronously before memory is exhausted (see Today's trick : memory limits with Gc alarms and [ANN] memprof-limits preview (and a guide to handle asynchronous exceptions)).
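The Gc-alarm trick from the linked post can be sketched roughly as follows. The limit value and the exception name are illustrative (the post itself reuses Out_of_memory); the alarm runs at the end of each major collection cycle, so the exception surfaces asynchronously at some allocation point in the program, well before memory is actually exhausted:

```ocaml
(* Raise an exception from a Gc alarm once the major heap exceeds a
   soft limit. Alarms fire at the end of each major GC cycle. *)
exception Memory_limit_exceeded

let install_limit ~limit_words =
  Gc.create_alarm (fun () ->
      if (Gc.quick_stat ()).Gc.heap_words > limit_words then
        raise Memory_limit_exceeded)

(* Allocate without bound until the limit interrupts us. *)
let run_until_limit ~limit_words =
  let alarm = install_limit ~limit_words in
  let result =
    try
      let acc = ref [] in
      while true do acc := Array.make 1024 0 :: !acc done;
      assert false
    with Memory_limit_exceeded -> `Limit_hit
  in
  Gc.delete_alarm alarm;
  result

let () =
  (* ~80 MB of live words on a 64-bit machine; purely illustrative. *)
  match run_until_limit ~limit_words:10_000_000 with
  | `Limit_hit -> print_endline "limit reached; aborting the task"
```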

My suggestion is to see whether you have a bug (then try profiling to find the bug), or if limiting memory consumption is an inherent requirement of your application (then try to use the memory limits mentioned above).


Thank you Guillaume for this detailed reply.

What I understand from your example in Today's trick : memory limits with Gc alarms is that Out_of_memory is raised once the program's memory use exceeds the specified limit.

So, is this just to find out where exactly the error was raised?
In the finally block, heap compaction is requested. So, can't we resume the operation after shrinking the heap, instead of just raising the Out_of_memory exception?

(It was a bit confusing on my part to re-use Out_of_memory, I should have called the exception differently.)

The purpose is indeed that this exception can be caught. Be careful, though: some care must be taken to ensure that you do not end up in a corrupted state. See the guidelines from memprof-limits: https://gitlab.com/gadmm/memprof-limits/-/blob/master/doc/recovering.md.
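A minimal sketch of the pattern those guidelines recommend: catch the interrupting exception only at a clean isolation boundary, discard the task's partial state rather than resuming mid-computation, and optionally compact before carrying on. The names `run_isolated` and `task` are illustrative:

```ocaml
(* Treat the interrupt as "this task failed", not as something to
   resume from. All of the task's partial state lives inside [task],
   so dropping its result leaves the program in a consistent state. *)
let run_isolated task =
  match task () with
  | v -> Ok v
  | exception Out_of_memory ->
      (* Return freed memory to the OS before moving on to other work. *)
      Gc.compact ();
      Error `Out_of_memory
```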

It might be worth revisiting *why* you're getting out-of-memory errors. With modern machines, unless you're parsing truly giant files (or have a bug), you shouldn't be running out of memory. Limiting the memory your program can consume is really a way of making it fail faster, so you can find the bug, whatever it is.
