Exception vs Result

One rough edge I’ve found with open variant error types is that they don’t seem to work well with exhaustiveness checking. For example, the following compiles just fine despite the superfluous Error `bar case:

module M : sig
  val foo : (unit, [> `foo ]) Result.t
end = struct
  let foo = Error `foo
end

let () =
  match M.foo with
  | Ok () -> ()
  | Error `foo -> ()
  | Error `bar -> ()

We can change the type to val foo : (unit, [ `foo ]) Result.t, but only at the cost of unification with other errors:

# let l = [ M.foo; Error `bar ];;
Line 1, characters 23-27:
Error: This expression has type [> `bar ]
       but an expression was expected of type [ `foo ]
       The second variant type does not allow tag(s) `bar

Or you can do a more painful coercion:

let l = [ (M.foo :> (_, [ `foo | `bar ]) Result.t); Error `bar ]

If anyone has a better technique for working with these, I’m all ears!


I’m holding onto the ledge of not diverging into a generalized “discourse” on the merits of monadic styles by my fingertips :wink:

This is a very helpful clarification; I was not aware offhand of raise_notrace, thank you. I can certainly see its usefulness (especially for those rare cases when I might use exceptions for nonlocal returns and control flow that won’t escape some library boundary), but the prospect of code using this approach widely (and thus gutting exceptions’ utility for locating the source of errors) just for the hope of a smidgen more performance is unpleasant.
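
To make the nonlocal-return case concrete, here is a minimal sketch. The `find_first` helper and the `Found` exception are hypothetical names, not from any library. The exception never escapes the function, so skipping the backtrace loses nothing:

```ocaml
(* Early exit from List.iter via a local exception.
   raise_notrace is used because the exception is caught a few lines
   below and can never reach a caller. *)
let find_first (type a) (p : a -> bool) (xs : a list) : a option =
  let exception Found of a in
  try
    List.iter (fun x -> if p x then raise_notrace (Found x)) xs;
    None
  with Found x -> Some x
```

Anything that might cross a library boundary would still use plain `raise`, so that the origin of a real error stays locatable.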

I think grounding any part of an exceptions vs. result debate/decision on performance is misguided in general. The ballgame for 99.5% of contexts is error-handling and sometimes recovery, and in these contexts, performance just isn’t a significant criterion in my experience.

I won’t dispute the truth of this in any formal sense, but just because the operational semantics of those primitives are formally equivalent to their monadic reformulations doesn’t mean they are equivalent experientially, or with regard to programmer comprehension. While in the fray, we programmers think we understand what’s happening in our programs when using mutable state, non-local control mechanisms, IO, or concurrency primitives, but we clearly do not. The awkwardness that sometimes comes with monadic approaches to these primitives is often a consequence of our being forced to directly confront their essential complexity, rather than letting them float about ambiently.

Just as a brief experience report, I had a fantastic outcome recently in refactoring a stdlib Result-based program of some complexity that also had some “side channel” state that had to be aggregated here and there. Handling the latter was incredibly awkward, using a ref off to the side…in my prior lisp life, I would have used a dynamically-scoped variable. But, BAP’s monads transformer library was waiting for me, and I was able to replace the mess with a very tidy State+Result monad, with zero performance penalty according to my benchmarking at the time of the switch.
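
For readers who have not seen the shape of such a thing, here is a hand-rolled sketch of a State+Result monad. It is not BAP’s API; every name in it (`return`, `bind`, `get`, `put`, `fail`, `add_checked`, `add_all`) is an assumption made for illustration:

```ocaml
(* A computation takes a state and either fails with an error or
   returns a value together with the updated state. *)
type ('a, 's, 'e) t = 's -> ('a * 's, 'e) result

let return x = fun s -> Ok (x, s)

let bind m f = fun s ->
  match m s with
  | Error e -> Error e
  | Ok (x, s') -> f x s'

let ( let* ) = bind

let get = fun s -> Ok (s, s)       (* read the state *)
let put s = fun _ -> Ok ((), s)    (* replace the state *)
let fail e = fun _ -> Error e      (* abort; the state is dropped *)

(* Example: accumulate a running total in the state, rejecting negatives. *)
let add_checked n =
  if n < 0 then fail (`negative n)
  else
    let* total = get in
    put (total + n)

let rec add_all = function
  | [] -> get
  | n :: rest ->
    let* () = add_checked n in
    add_all rest
```

Running `add_all [1; 2; 3] 0` threads the total through each step without a `ref` off to the side, and the first negative number short-circuits the rest.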


Unfortunately, that is not true. If you have compiled your OCaml program with option -g and if you have called Printexc.record_backtrace, the exception backtrace is materialized as soon as you call Stdlib.raise. (As already hinted, that is why Stdlib.raise_notrace exists in the first place.) There is no way to create this backtrace after the fact. Either it is created by Stdlib.raise or it does not exist.


I don’t have a solution to the problem, but just pointing out that this plays well with exhaustiveness checking: if you had failed to handle the `foo case then you would get an error as expected, so the errors are checked exhaustively. But yes, it does mean you can try to handle errors you’ll never actually get.

Not a great solution, but you can specify the type; see below. Also, if you don’t care about the specific error, only about its type, you can catch it with match foo with | #M.err -> ...

module M : sig
  type err = [`foo]
  val foo : (unit, [> err ]) Result.t
end = struct
  type err = [`foo]
  let foo = Error `foo
end

let () =
  match M.foo with
  | Ok () -> ()
  | Error (`foo : M.err) -> ()
  | Error (`bar : M.err) -> ()
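
A small sketch of the `#M.err` pattern mentioned above. The module contents, the extra `` `timeout `` tag, and the `classify` function are made up for illustration:

```ocaml
module M = struct
  type err = [ `foo | `bar ]
  let foo : (unit, [> err ]) result = Error `foo
end

(* #M.err is shorthand for the or-pattern (`foo | `bar), so one arm
   covers every error that M declares. *)
let classify (r : (unit, [ M.err | `timeout ]) result) : string =
  match r with
  | Ok () -> "ok"
  | Error #M.err -> "error from M"
  | Error `timeout -> "timeout"
```

If `M.err` later gains a tag, the `#M.err` arm picks it up without any change to `classify`.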

TL;DR it appears that @silene is correct, that OCaml does record the backtrace (and pays a cost proportional to stack-depth) but this is only a small part of the cost of “materializing the backtrace” into the heap (which would need to be done to carry it around in a Result).

First, thank you for pointing this out: my understanding was based on seeing that exception backtraces aren’t reliably recorded, and the only way to really be sure of getting one, is to ask for it right after the exception is caught the first time.

But second [now, digging thru the source … ah, isn’t the source a dream to read thru?] I find that there are two steps: (1) copying the backtrace from the stack-frames into a “backtrace buffer” (which appears to be associated with the global state, and hence apparently static?), and (2) copying that into a heap-allocated array for the application to use or discard, e.g. for adding to a Result.

I wrote a little test to check this out. I’ll insert the results, followed by the test and makefile. It’s possible I’ve done something wrong. The test:

(1) call record_backtrace
(2) with a flag, either calls get_backtrace() or does not, at the point it catches the exception
(3) and then K times, does a depth-N recursion, where it does a try/with block – within which it does a depth-M recursion, and inside that it does a failwith.
(4) so as we vary N, we are varying the depth of the stack
(5) varying M varies the depth of the stack between the try-catch and the raise (not as interesting)
(6) and we can control whether get_backtrace gets called to materialize the backtrace to the heap.
(7) the test is run as test1 <get-backtrace> <n> <m> <k>

It seems that indeed, the cost of a raise/catch is proportional to the depth of the stack. But also that get_backtrace adds a large cost on top of that: roughly 10x at depth 100000, and far more at the shallower depths.


./test1 false run1 1000 10 100
run1@100: 0.000066
./test1 true run1 1000 10 100
run1@100: 0.052373
./test1 false run1 10000 10 100
run1@100: 0.000741
./test1 true run1 10000 10 100
run1@100: 0.054388
./test1 false run1 100000 10 100
run1@100: 0.006881
./test1 true run1 100000 10 100
run1@100: 0.077785

source (e1.ml)

Printexc.record_backtrace true ;;

let get_backtrace = ref true ;;

let depth pre f post n =
let rec drec n =
    pre () ;
    let rv =
      if n = 0 then f() else drec (n-1) 
    in post () ; rv
in drec n

let raiser () = 
  failwith "caught" ;;

let catcher pre post k =
  try
    depth pre raiser post k
  with Failure _ ->
    if !get_backtrace then
      (* materialize the backtrace into the heap as a string *)
      ignore (Printexc.get_backtrace ())

let nop () = () ;;

let harness ~tag reps f =
  let stime = Unix.gettimeofday() in
  let () = f() in
  let etime = Unix.gettimeofday () in
  Fmt.(pf stdout "%s@%d: %f\n%!" tag reps (etime -. stime)) ;;

let bt = bool_of_string Sys.argv.(1) in
let tag = Sys.argv.(2) in
let n = int_of_string Sys.argv.(3) in
let m = int_of_string Sys.argv.(4) in
let reps = int_of_string Sys.argv.(5) in
get_backtrace := bt ;
harness ~tag:tag reps (fun () -> depth nop (fun () -> catcher nop nop m) nop n)


test:: all
	./test1 false run1 1000 10 100
	./test1 true run1 1000 10 100
	./test1 false run1 10000 10 100
	./test1 true run1 10000 10 100
	./test1 false run1 100000 10 100
	./test1 true run1 100000 10 100

all: test1

test1: e1.ml
	ocamlfind ocamlc -g -package fmt,unix -linkall -linkpkg -o test1 e1.ml

Not quite. There is no good reason for any code to ever be calling Printexc.get_backtrace in practice. This function should only be called when you actually want to pretty-print the backtrace for the user (which almost never happens). One should call Printexc.get_raw_backtrace instead (and then call Printexc.print_raw_backtrace, but only if actually needed). If you run your benchmark again, the timings you get will be mostly in the noise range, since Printexc.get_raw_backtrace only allocates a block and copies the backtrace into it (as you already noticed).
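
As a rough sketch of that advice: grab the raw backtrace first thing in the handler, and only render it when someone actually asks. `capture` and `report` are hypothetical names:

```ocaml
let () = Printexc.record_backtrace true

(* Run f, turning any exception into an Error that carries the raw
   backtrace. get_raw_backtrace is called first thing in the handler,
   before any other code can raise and overwrite the buffer. *)
let capture (f : unit -> 'a) : ('a, exn * Printexc.raw_backtrace) result =
  try Ok (f ()) with
  | exn ->
    let bt = Printexc.get_raw_backtrace () in
    Error (exn, bt)

(* Formatting is deferred until a human actually needs to read it. *)
let report (exn, bt) =
  Printf.eprintf "%s\n%s%!"
    (Printexc.to_string exn)
    (Printexc.raw_backtrace_to_string bt)
```

Without `-g` the raw backtrace is simply empty, so `report` prints only the exception.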

Now, back to the topic of Result. An implementation would not call Printexc.get_raw_backtrace, since the whole point of Result is to not raise exceptions, so there is no exception backtrace to record in the first place. Instead, it would call Printexc.get_callstack. So, the real question is: how much slower is it to call Printexc.get_callstack rather than just recording the backtrace for consumption by Printexc.get_raw_backtrace?

Fundamentally, the code is the same, with only two differences. First, there is a heap allocation (as does Printexc.get_raw_backtrace, for the same reason), but hopefully this will not be noticeable performance-wise. Second, exceptions record the backtrace only till the next exception handler. But for Printexc.get_callstack, it is up to the user to guess the length of the backtrace to be recorded.

So, to summarize, whether you use exceptions or Result, the overhead of backtraces will be exactly the same, as long as the Result code correctly estimates the size of the backtrace to record. If the backtrace requested by Result is systematically 10x the average length of an exception backtrace, then there will be a 10x overhead.
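
A sketch of what that could look like on the Result side, assuming a hypothetical `traced` record and `fail` helper. The default depth of 16 is an arbitrary guess, which is exactly the estimate the user is left to make:

```ocaml
(* An error value paired with the call stack at the point of failure. *)
type 'e traced = { error : 'e; stack : Printexc.raw_backtrace }

(* get_callstack returns at most `depth` entries from the top of the
   current call stack; no exception is involved. *)
let fail ?(depth = 16) (error : 'e) : ('a, 'e traced) result =
  Error { error; stack = Printexc.get_callstack depth }

let describe (t : string traced) : string =
  t.error ^ "\n" ^ Printexc.raw_backtrace_to_string t.stack
```

Passing `~depth:160` instead would capture (and pay for) up to ten times as many frames.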


FWIW, I have never done this in my result code. Once I’m making an API public, the errors should be descriptive enough to debug the cause without a backtrace.

On a philosophical tangent: IMO backtraces are not a thing a developer should want. In development it’s maybe useful but in production software, it’s a sign that the error handling is not being done well. It’s the ultimate breaker of abstraction. Of course, there may be pragmatic reasons to want a backtrace, but I think most of the arguments depend on living in a world where error handling is an after thought because it’s hard.


And a minor bit of clarification: Printexc.get_raw_backtrace's documentation admonishes "Same restriction usage than Printexc.print_backtrace", which declares:

If the call is not inside an exception handler, the returned backtrace is unspecified. If the call is after some exception-catching code (before in the handler, or in a when-guard during the matching of the exception handler), the backtrace may correspond to a later exception than the handled one.

Indeed, if you replace get_callstack in the little example I had in my original post with get_raw_backtrace, the returned backtrace has no slots in it, and so prints nothing (though I suppose “unspecified” could mean even worse things depending on your choice of compiler or target runtime).

I find these kinds of “shoulds” to be very counterproductive. Of course descriptive error messages are great, but all other things being equal: more, better information that helps a downstream developer (often yourself!) is always preferable over less, worse information.

This is demonstrably untrue. Bad things usually happen in unexpected ways, and the worst bad things definitionally happen in production. If this weren’t the case, you wouldn’t have massive industries built around production observability, APM, and incredibly sophisticated tooling in the largest ecosystems to help analyze production failures rooted in programming faults.


I argue that this is because people are using error reporting tools that go around the static analysis in most languages. The time I need a backtrace in OCaml is when some code threw an exception that I wasn’t expecting, or I made a human mistake and forgot to handle it. This very well could be confirmation bias, but this issue has never happened to me when using result because the errors are always in my face. There is never a hidden error. So I don’t need a backtrace to figure out how I got to a point where the weird exception was raised.


A former friend once told me that he always checks every return-code. He wrote systems-level code (comms, storage) so sure. I’ve read and debugged massive piles of code in my job troubleshooting websphere dumpster fires. And sure, there were lots of times that code would “swallow” Runtime (unchecked) exceptions. But there were also lots of times that code would swallow checked exceptions, for want of anything else to do. The idea that somehow most application programmers actually know what to do with many of the exceptions they see … is somewhat unrealistic.

Let me say that differently: sure, for the OCaml community of hard-core researchers and (maybe) systems-jocks, expecting everybody to handle every exception and error is reasonable. But for the community into which you all expect OCaml to make inroads (unless you think that Reason/bucklescript/whatever are doomed), these assumptions are counterfactual. Backtraces exist and are valuable because people don’t handle errors well. That’s [that people don’t properly handle their error-conditions] a reality of all application programming in all environments, across all time.


Unless you are extremely careful (wrapping all functions that can potentially throw an exception), exceptions can still leak and finding those without a backtrace is hard.

most application programmers actually know what to do with many of the exceptions they see … is somewhat unrealistic

Indeed, that goes so far that the factory defaults of Java IDEs’ code completion usually call printStackTrace where they should wrap into a runtime exception. The sunshine attitude (we’ll bother with non-sunshine cases later. Maybe) is state of the art.

This seems to be the core of your counter and, to me, this logic seems circular: we need a tool that makes it hard to handle errors well (exceptions) because people don’t handle errors well. My own experience has suggested that this need not be the case, but maybe I just prefer the style I’ve chosen. As I said before, I think we make errors hard to handle so people prefer not to, so we should focus on making errors easier to handle.

Yes, checked exceptions in Java do suck, but, IMO, for ergonomic reasons. They require a bunch of annoying boilerplate.

Agreed. And in those cases I do use a backtrace. My argument is really more about using something other than exceptions for error reporting and that tool removing the need for backtraces. Apologies for the lack of clarity.


I don’t think you’ve understood what I’m saying:

  1. people are stupid. application programmers, especially so.
  2. most of the IT systems that run our world are written by application programmers.
  3. there are not enough adequately-skilled systems-programmer-level folks living to do all that work.
  4. and who would want to spend their lives writing transactions to implement the business rules to calculate net asset value for fifty different mutual funds with different rules for each one? [I bore myself just writing that down, and yet it’s a small part of the work of a particular multi-million-dollar mainframe app at a large custodial bank]
  5. So those applications programmers, and their stupidity, are unavoidable.
  6. application programmers write code that doesn’t handle all the exceptions it raises, sometimes because they don’t know what to do, sometimes because they’re lazy
  7. So they typically just add the “throws X, Y, Z” to all their method-signatures, or wrapper that exception in a RuntimeException and be done with it.
  8. and then they catch and print the stack trace at toplevel (when I was lucky, this was the case; when I was less lucky, they only printed the exception; when I was singularly unlucky, they’d silently swallow the exception).

Look: you’re a smart programmer. So here’s an exercise: pick some problem that requires you to think decently hard to solve it. Then get drunk, and solve that problem while continuing to drink. It’s a small example of what it’s like, being the people who -use- the systems we build, out in industry.

ETA: it might help to insert point #9: of course, most executions don’t throw exceptions, or this coding method wouldn’t work. But when errors occur, and those exceptions get thrown, then eventually someone smarter is brought in to figure out what the hell happened, and for that person, having backtraces is crucial.


You’re right: switching to get_raw_backtrace reduces the difference to noise. But that isn’t the only overhead of using the Result monad. Since this is dragging on, let me just bring out the benchmark I wrote a while back, when I observed all this “I’m having trouble with Lwt” activity for the Nth time: I wrote a little set of benchmarks that read char-by-char from a buffer or a file, and put that on Lwt, Async, and “direct style”. Just now, I put it on Result.

What these benchmarks show, is that there is a real cost to using these monads. If we’re going to be serious about catching and dealing-with all exceptions, we’re going to have to wrapper input_char (it can raise an exception) so these benchmarks aren’t unrealistic.
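
For concreteness, a wrapper of that kind might look like the sketch below. `input_char_result` and its error tags are hypothetical, and this is not the code from the benchmark repository:

```ocaml
(* Turn the exceptions input_char can raise into Result errors.
   Every successful read builds an `Ok c` value, which is typically a
   fresh heap block: the per-step cost under discussion. *)
let input_char_result (ic : in_channel)
  : (char, [> `End_of_file | `Sys_error of string ]) result =
  match input_char ic with
  | c -> Ok c
  | exception End_of_file -> Error `End_of_file
  | exception Sys_error msg -> Error (`Sys_error msg)
```

The direct-style version is just `input_char ic`, with `End_of_file` left to propagate as an exception.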

Using a framework that requires at every step a memory-allocation (like any of these monads) is a bad idea. Allocation-avoidance is a known optimization strategy in GCed languages: in the late 90s I modified a commercial JVM so that “java -prof” would use a counter of “total bytes allocated” as the “clock” instead of the wallclock; it was remarkably useful as a tool for finding performance problems, because where you’re allocating you’re wasting time.

And this is something that caml-light encountered in its own history. The original reason that caml-light was a winner over SML/NJ wasn’t speed: it was allocation-rate. The original caml-light had two different optimizations that combined to radically reduce allocations for typical functional code:

  1. right-to-left (ok ok, “unspecified”) evaluation-order, so no need to make intermediate closures
  2. closures that are never returned are kept on-stack, not heapified.

These had significant effects on performance and memory-footprint, even though in theory all that allocated memory was transient and should have been easy pickings for the GC.

OK, the results:

./async_read_buffer.opt /dev/zero 10485760
10485760 read in 1.180364 secs
./lwt_read_buffer.opt /dev/zero 10485760
10485760 read in 0.240806 secs
./rresult_read_buffer.opt /dev/zero 10485760
10485760 read in 0.070802 secs
./direct_read_buffer.opt /dev/zero 10485760
10485760 read in 0.028591 secs
./async_read_file.opt /dev/zero 10485760
10485760 read in 3.952371 secs
./lwt_read_file.opt /dev/zero 10485760
10485760 read in 0.362876 secs
./rresult_read_file.opt /dev/zero 10485760
10485760 read in 0.556473 secs
./direct_read_file.opt /dev/zero 10485760
10485760 read in 0.120328 secs

The link: https://github.com/chetmurthy/sandbox-public/tree/master/monads


Again, I think your response is predicated on “handling errors is hard” which is then used to argue for this system that is one of the reasons handling errors is hard. I’m just saying that we should try to find ways to make handling errors easier. I’m interested to see what algebraic effects do here.

I understand you disagree. The system I have been using seems to work for me, so mark it in your book as a little anecdote.

I think there is a parallel in this discussion to the dynamic type checking vs static types discussion. Pro-dynamic type checking folks say why do I need to add those types, it’s just annoying and I get the types right most of the time anyways, and hey programs written in statically typed languages have bugs anyways so what’s the point? Maybe my solution doesn’t scale well, I hope most people can agree that error handling needs improvement, though.


This is not really a different technique, but a solution to reduce boilerplate due to type coercion (and make it less painful).

type a = [`foo]
type b = [`bar]
type c = [`baz]

let (let*) = Result.bind
let return x = Ok x

let foo x : (_, a) result = return x
let bar x : (_, b) result = return x
let baz x : (_, c) result = return x

let f x =
  (* define a cast function to make coercion less painful *)
  let cast x = (x :> (_, [a | b]) result) in
  let* x = cast (foo x) in
  let* x = cast (bar x) in
  return x
val f : 'a -> ('a, [ `bar | `foo ]) result = <fun>

let g x =
  let cast x = (x :> (_, [b | c]) result) in
  let* x = cast (bar x) in
  let* x = cast (baz x) in
  return x
val g : 'a -> ('a, [ `bar | `baz ]) result = <fun>

(* or you can use a local open to define local type aliases *)
let h x =
  let open struct
    type err = [a | b | c]
    type 'a t = ('a, err) result
  end in  
  let* x = (f x :> _ t) in
  let* x = (g x :> _ t) in
  return x
val h : 'a -> ('a, [ `bar | `baz | `foo ]) result = <fun>

Edit: More explicit, and I believe cleaner, way to use local open to define local type aliases.

Related to some of the remarks about Java’s checked exceptions, here is an interview with Anders Hejlsberg (the lead C# architect at the time):

I see two big issues with checked exceptions: scalability and versionability
Bruce Eckel : I used to think that checked exceptions were really great.
Anders Hejlsberg : Exactly. Frankly, they look really great up front, and there’s nothing wrong with the idea. I completely agree that checked exceptions are a wonderful feature. It’s just that particular implementations can be problematic. By implementing checked exceptions the way it’s done in Java, for example, I think you just take one set of problems and trade them for another set of problems. In the end it’s not clear to me that you actually make life any easier. You just make it different.