[BLOG] OCaml Backtraces on Uncaught Exceptions, by OCamlPro

OCamlPro · April 25, 2024, 1:14pm

Greetings Cameleers,

Here’s another heads-up about our latest blog release!

Today’s topic is an unintentionally hidden feature of the OCaml dev
environment: backtraces on uncaught exceptions!

We believe this will be old news to the veteran OCaml devs but could be of much
use to the newer Cameleers out there!

Hopefully, you will learn a thing or two from reading this short article. We welcome all feedback in this very thread. Thank you for reading!

Kind regards,
The OCamlPro Team

sim642 · April 29, 2024, 7:41pm

If you absolutely need to catch all exceptions, a last resort is to explicitly re-raise “uncatchable” exceptions:
let this_is_a_last_resort =
  try .. with
  | (Sys.Break | Assert_failure _ | Match_failure _) as e -> raise e
  | _ -> ..

I would add to this “uncatchables” list:

Out_of_memory and Stack_overflow, which come from the OCaml runtime. For example, Lwt wants to avoid catching these specifically: Dont catch ocaml runtime exceptions by raphael-proust · Pull Request #964 · ocsigen/lwt · GitHub.
Undefined_recursive_module _, although it’s relatively obscure.

yawaramin · April 29, 2024, 7:59pm

The standard library should probably provide a predicate that defines whether an exception is ‘uncatchable’ or not. Eg apparently Invalid_argument is not supposed to be caught either.
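To make the idea concrete, here is a minimal sketch of what such a predicate could look like. The names `is_uncatchable` and `try_with_nonfatal` are hypothetical, nothing like them exists in the standard library, and the list of constructors is only the one gathered in this thread (whether `Invalid_argument` belongs in it is left open here).

```ocaml
(* Hypothetical predicate: NOT part of the standard library.
   The constructor list is an assumption taken from this thread. *)
let is_uncatchable = function
  | Sys.Break
  | Assert_failure _
  | Match_failure _
  | Out_of_memory
  | Stack_overflow
  | Undefined_recursive_module _ -> true
  | _ -> false

(* Hypothetical catch-all that lets "uncatchable" exceptions through,
   re-raising them with their original backtrace. *)
let try_with_nonfatal f =
  try Ok (f ()) with
  | e when is_uncatchable e ->
      let bt = Printexc.get_raw_backtrace () in
      Printexc.raise_with_backtrace e bt
  | e -> Error e
```

With this shape, `try_with_nonfatal (fun () -> raise Not_found)` yields `Error Not_found`, while a `Sys.Break` raised inside the function escapes the wrapper.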

AltGr · May 15, 2024, 8:05am

Out_of_memory and Stack_overflow

Indeed, I may add a paragraph about those. The thing is, you never want to catch them, but you probably don’t even want to run finalisers on them (e.g. try ... with ... | e -> finalise(); raise e) because that’s very likely to fail again. Best to not attempt to fail gracefully on those, but as you point out, it’s even worse to catch them “by accident”.

mbarbin · May 15, 2024, 9:40pm

I haven’t yet had the chance to read the blog post, but I’ve been following the ensuing discussion here (Hi Louis!). Regarding the comments about the handling of “uncatchable” exceptions, I wanted to bring up 3 examples from commonly used libraries:

Rresult.R.trap_exn (latest = 0.7.0):

  let trap_exn f v = try Ok (f v) with
  | e ->
      let bt = Printexc.get_raw_backtrace () in
      Error (`Exn_trap (e, bt))

Base.Or_error.try_with (latest = v0.17.0):

let try_with ?(backtrace = false) f =
  try Ok (f ()) with
  | exn -> Error (Error.of_exn exn ?backtrace:(if backtrace then Some `Get else None))
;;

Stdlib.Fun.protect (latest = 5.2)

let protect ~(finally : unit -> unit) work =
  let finally_no_exn () =
    try finally () with e ->
      let bt = Printexc.get_raw_backtrace () in
      Printexc.raise_with_backtrace (Finally_raised e) bt
  in
  match work () with
  | result -> finally_no_exn () ; result
  | exception work_exn ->
      let work_bt = Printexc.get_raw_backtrace () in
      finally_no_exn () ;
      Printexc.raise_with_backtrace work_exn work_bt

I haven’t delved deeply into these examples, but at first glance, it seems like all three might exhibit the characteristic of catching all exceptions, including those that are considered “uncatchable” as discussed here. Do you concur? Is this really an issue in practice?
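A small sketch of the concern, using a local stand-in named `trap` (hypothetical, written only to have the same shape as the catch-all wrappers quoted above, and not the code of any of those libraries): an interrupt raised inside the wrapped function comes back as an ordinary `Error` value.

```ocaml
(* Local stand-in with the same "try ... with e -> Error e" shape. *)
let trap f v = try Ok (f v) with e -> Error e

(* Stands for a computation interrupted by Ctrl-C. *)
let interrupted () : unit = raise Sys.Break

let swallowed =
  match trap interrupted () with
  | Error Sys.Break -> true (* the interrupt became an ordinary value *)
  | Ok () | Error _ -> false
```

Here `swallowed` is `true`: the caller has to inspect the `Error` payload to notice that the program was asked to stop.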

Possibly related discussions:

By the way, this may be off-topic for now, but I’m curious about how these exceptions would interact with an OCaml type system that gained the ability to express the exceptions a function can raise as part of its type. What would be the status of exceptions that are designed not to be caught?

dbuenzli · May 15, 2024, 10:43pm

Ooops. Not good. I have to admit I no longer use Rresult and consider it somehow deprecated (the docs advise to prefer the Result module). But if someone files a bug I will change this.

I wouldn’t concur here, protect doesn’t swallow any exception, they all get eventually re-raised. They may however be wrapped in a Finally_raised exception.
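The two behaviours can be checked with a short sketch. The `log` and `note` helpers are only there for illustration.

```ocaml
(* Illustration-only helpers that record that the finaliser ran. *)
let log = ref []
let note s = log := s :: !log

(* The exception raised by the work function is re-raised after
   [finally] has run. *)
let reraised =
  match
    Fun.protect ~finally:(fun () -> note "cleanup") (fun () -> raise Exit)
  with
  | () -> false
  | exception Exit -> true

(* An exception escaping [finally] is wrapped in [Fun.Finally_raised]. *)
let wrapped =
  match
    Fun.protect ~finally:(fun () -> failwith "cleanup failed") (fun () -> ())
  with
  | () -> false
  | exception Fun.Finally_raised (Failure _) -> true
```

Both `reraised` and `wrapped` are `true`, and `log` shows that the finaliser ran in the first case.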

A simple example is the toplevel. In the top-level you want to be able to stop computations with C-c. If you catch the Break exception that’s no longer possible.

Regarding catching Stack_overflow without meaning to do so, it can significantly lengthen a possible debugging session… been there.

edwin · May 15, 2024, 10:52pm

Depends on the program. I’d argue that it should be caught as late as possible.
If you have a daemon that handles many API calls then crashing the entire daemon because one API call triggered Invalid_argument is not desirable (might even be a DoS security bug). Instead the exception should be caught, logged, and prevented from interfering with the handling of other API calls.
The point to catch the exception is the main API call handler.

This is very different from Out_of_memory where you’ve exhausted a resource, and your exception handler may not have enough resources to succeed either (which is discussed in more detail in the linked discussion thread).
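As a rough sketch of that split, assuming a hypothetical `response` type and a hypothetical `handle_call` dispatcher (neither comes from a real library): ordinary failures are logged and turned into an error response at the call handler, while resource exhaustion and interrupts are left to propagate.

```ocaml
(* Hypothetical response type, for illustration only. *)
type response = Reply of string | Server_error of string

(* Hypothetical main API call handler. *)
let handle_call (f : string -> string) (arg : string) : response =
  try Reply (f arg) with
  | (Out_of_memory | Stack_overflow | Sys.Break) as e ->
      (* Resource exhaustion or interrupt: do not try to recover here. *)
      raise e
  | e ->
      (* Log and isolate the failure so that other calls keep being served. *)
      let msg = Printexc.to_string e in
      prerr_endline ("call failed: " ^ msg);
      Server_error msg
```

Which exceptions belong in the first branch is a judgment call that depends on the program. The list used here is just an assumption drawn from this discussion.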

In fact to find the place to put the handler for these exceptions you can probably follow the rule of thumb in the linked thread, everything else should propagate the exceptions:

dbuenzli · May 15, 2024, 10:57pm

I’d say the right way to put it is that Invalid_argument is not meant to be triggered. A good example is Array.get: you may want to compile with -unsafe, which removes the bounds check that raises it.

edwin · May 15, 2024, 11:04pm

The standard library is almost consistent on only using Invalid_argument for bounds errors (or other errors that can be fatal in some circumstances), but there are a few functions that use it for input validation (although other places use Failure for that).

E.g. bool_of_string raises Invalid_argument, but float_of_string raises Failure. I think bool_of_string should’ve raised Failure too, but too late to change that now.
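The inconsistency is easy to observe. The sketch below only uses standard library functions. The `_opt` variants shown at the end are one way to sidestep the question when validating input.

```ocaml
(* bool_of_string signals bad input with Invalid_argument... *)
let raises_invalid_argument =
  match bool_of_string "maybe" with
  | (_ : bool) -> false
  | exception Invalid_argument _ -> true

(* ...while float_of_string signals it with Failure. *)
let raises_failure =
  match float_of_string "not a float" with
  | (_ : float) -> false
  | exception Failure _ -> true

(* The option-returning variants raise neither. *)
let b = bool_of_string_opt "maybe"
let f = float_of_string_opt "1.5"
```

Here `b` is `None` and `f` is `Some 1.5`.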

Chet_Murthy · May 15, 2024, 11:14pm

Sure, in your example, if that “Invalid_argument” can be completely handled and repaired (assuming any repair is needed) at the API dispatch handler, then that’s the right place to do it. The question is whether it can be completely handled and repaired. If there’s any doubt, then that’s the wrong place to handle it.

Going further, if this is a critical system, then it should be replicated, so that even if handling that fault requires restarting the process, the service as a whole remains available. And last, if the fault is sufficiently frequent that even then it affects system availability, then that by definition means it’s sufficiently frequent that it can be quickly debugged and eliminated.

The key teaching of fault-tolerant systems is that you have to “repair the fault completely” as soon as possible after it occurs. That’s the key teaching. You get to define what “completely” means, but if you screw that up, then you’ve built a fault-prone system.

dbuenzli · May 15, 2024, 11:22pm

Yes bool_of_string is the joke in the stdlib. For some time the difference between Invalid_argument and Failure was not crystal clear but this was clarified by this commit.

mbarbin · May 16, 2024, 10:19am

Got it. I initially brought up Stdlib.Fun.protect because I thought it might contradict @AltGr's recommendation:

Louis (I may have misunderstood your recommendation), seeing the second layer of try with in Stdlib.Fun.protect and finaliser’s exceptions wrapping (Finally_raised) - does this address your concerns about running finalisers on “not-to-be-caught” exceptions?

gadmm · May 16, 2024, 1:37pm

The purpose of Finally_raised from Fun.protect was indeed to convert any exception into an “uncatchable” one. I think it was the first time that something in the compiler distribution was designed explicitly around the distinction between errors and “uncatchable” exceptions.

There is no built-in predicate to check whether an exception is an error or is “uncatchable”. Programs themselves end up implementing this predicate, e.g. in Coq: coq/lib/cErrors.ml at e1223865a913d6f152f36eb561c475c82f40d4b5 · coq/coq · GitHub. Coq devs think that this forces them into making whole-program assumptions and therefore denotes a missing language abstraction.

I say more about this distinction, why I don’t like the name “uncatchable”, and how the language could evolve, in my talk at OCaml 2021 [1].

As per the linked discussion about Out_of_memory, this one is actually only reliable as a (catchable, recoverable) error, where your system is not actually out of memory.

Stack_overflow is a bit apart from other “uncatchable” exceptions, but I would not go as far as saying that it should be ignored by Fun.protect. If Fun.protect is not part of the recursion that loops, then you can still expect it to work.

[1] Probabilistic resource limits using StatMemprof - Watch OCaml (unfortunately they copied an old title that did not reflect the contents well).

yawaramin · May 23, 2024, 10:58pm

Admittedly I haven’t watched your talk and don’t know what you recommend, but in Scala they are called ‘fatal’ exceptions.
