Lwt - how to catch exceptions raised by signal handlers?

gaborigloi · July 18, 2017, 4:38pm

I have some Lwt code that opens sockets, and I’d like to make sure that they get closed when the program is interrupted by SIGINT or SIGTERM. I’ve registered the handlers like this:

  let register_handlers () =
    Lwt_unix.on_signal Sys.sigterm (fun _ -> failwith "Caught SIGTERM");
    Lwt_unix.on_signal Sys.sigint (fun _ -> failwith "Caught SIGINT")

And the exception is succesfully raised when I press CTRL+C: (Failure "Caught SIGINT"). However, the cleanup code in the Lwt.finalize and Lwt.catch blocks does not catch this exception, and does not get executed for some reason. Is there a way of catching the exceptions raised by the handler in Lwt?

My code looks roughly like this, but I tried calling register_handlers from different places:

Lwt_unix.accept sock
>>= fun (fd, _) ->
Lwt.finalize
  (fun () -> register_handlers (); ... <handle client>)
  (fun () -> Lwt_unix.close fd)

I’d be grateful for any help you can provide

antron · July 18, 2017, 5:35pm

It does seem like a bug:

open Lwt.Infix

let () =
  Lwt_unix.on_signal
    Sys.sigint
    (fun _ ->
      failwith "Caught SIGINT")
  |> ignore;

  let p =
    Lwt.finalize
      (fun () ->
        Lwt_unix.sleep 2. >>= fun () ->
        raise Pervasives.Exit)
      (fun () ->
        prerr_endline "handled exception";
        Lwt.return ())
  in

  Lwt_main.run p

(* ocamlfind opt -linkpkg -package lwt.unix foo.ml && ./a.out *)

This behaves differently depending on whether you interrupt it with Ctrl+C, or let it raise Exit. Also, OCaml signal handling is somewhat strange. I don’t remember the full details at the moment. I opened an issue for addressing it:

https://github.com/ocsigen/lwt/issues/451

gaborigloi · July 18, 2017, 5:55pm

Thanks a lot for reporting the issue!

antron · July 18, 2017, 6:11pm

Thank you, you reported it

gaborigloi · July 19, 2017, 8:49am

Thanks a lot for looking into this issue!
While trying to find a workaround, I realized that if I have multiple signal handlers that raise an exception, only one of them gets executed:

open Lwt.Infix

let () =
 Lwt_unix.on_signal
    Sys.sigint
    (fun _ ->
       print_endline "Caught SIGINT 1"; failwith "SIGINT 1")
  |> ignore;
  Lwt_unix.on_signal
    Sys.sigint
    (fun _ ->
       print_endline "Caught SIGINT 2"; failwith "SIGINT 2")
  |> ignore;

  let p =
    Lwt.finalize
      (fun () ->
        Lwt_unix.sleep 2. >>= fun () ->
        raise Pervasives.Exit)
      (fun () ->
        prerr_endline "handled exception";
        Lwt.return ())
  in

  Lwt_main.run p

Output:

> ocamlfind opt -linkpkg -package lwt.unix foo.ml; and ./a.out
^CCaught SIGINT 1
Fatal error: exception Failure("SIGINT 1")

When I don’t raise an exception, both handlers get executed, but the program doesn’t stop:

open Lwt.Infix

let () =
 Lwt_unix.on_signal
    Sys.sigint
    (fun _ ->
       print_endline "Caught SIGINT 1")
  |> ignore;
  Lwt_unix.on_signal
    Sys.sigint
    (fun _ ->
       print_endline "Caught SIGINT 2")
  |> ignore;

  let p =
    Lwt.finalize
      (fun () ->
        Lwt_unix.sleep 2. >>= fun () ->
        raise Pervasives.Exit)
      (fun () ->
        prerr_endline "handled exception";
        Lwt.return ())
  in

  Lwt_main.run p

Output:

> ocamlfind opt -linkpkg -package lwt.unix foo.ml; and ./a.out
^CCaught SIGINT 1
Caught SIGINT 2
handled exception
Fatal error: exception Pervasives.Exit

So I think I have to raise an exception from the handler, to stop the program. So this means that I can have only one handler. So I came up with the following really ugly workaround, which seems to work:

  let cleanups = Hashtbl.create 4;;

  let cleanup_id = ref 0;;

  let register_cleanup_handler () =
    let handler s = Hashtbl.iter (fun _ c -> c ()) cleanups; failwith "Caught signal %s" s in
    Lwt_unix.on_signal Sys.sigterm handler |> ignore;
    Lwt_unix.on_signal Sys.sigint handler |> ignore

  let with_cleanup f cleanup =
    let i = !cleanup_id in
    incr cleanup_id;
    Hashtbl.add cleanups i cleanup;
    Lwt.finalize f (fun () ->
          Lwt_io.print "unregistering handler\n" >>= fun () ->
          Lwt.wrap (fun () -> Hashtbl.remove cleanups i) >>= fun () ->
          Lwt_io.print "unregistered handler\n")

The only issue is that I cannot use Lwt functions in the handler and in the cleanup functions .

raphael-proust · August 29, 2017, 4:56am

I investigated the issue. Details, proposal for change, and workaround available on this Github comment:

Note that signal handlers can’t be made to automaticallly send an exception wherever you want because Lwt has no way of knowing where it is appropriate. There is a workaround using waiters. See comment linked above.

gaborigloi · September 24, 2017, 10:12am

Thanks for looking into the issue! I’ve realized that the example in my original post isn’t a good one, because both Linux and Windows automatically close open file descriptors, therefore I don’t have to worry about cleaning these up when the program terminates, I only have to ensure that the program isn’t leaking open file descriptors while it’s running.

However, I’ll probably still need to free some resources when the program is stopped with a SIGTERM. I’ve realized that the situation becomes more complicated sadly when I have two nested finalizers.

If I use the wakener only in the inner block, then both cleanups get executed in all cases. However, when I press CTRL+C immediately after startup, I still have to wait for the first sleep to finish - what I originally wanted is to terminate the program immediately and run all the cleanups in all finalizers.
If I use the wakener with the outer block, the cleanup for the inner Lwt.finalize does not get executed.
If I use the same wakener for both finalize blocks, then it works as I intended - it exits immediately and calls both cleanup functions:

(* compile with
   ocamlfind ocamlc -o prog -linkpkg -package lwt -package lwt.unix prog.ml
*)
open Lwt.Infix

let () =
  let w,u = Lwt.wait () in
  Lwt_unix.on_signal
    Sys.sigint
    (fun _ ->
       Lwt.wakeup_exn u (Failure "Caught SIGINT"))
  |> ignore;

  let p =
    Lwt.finalize
      (fun () ->
         (Lwt_io.eprintl "block 1" >>= fun () ->
          Lwt_unix.sleep 2. >>= fun () ->
          Lwt.finalize
            (fun () ->
               (Lwt_io.eprintl "block 2" >>= fun () ->
                Lwt_unix.sleep 2. >>= fun () ->
                raise Pervasives.Exit)
               <?>
               w
            )
            (fun () -> Lwt_io.eprintl "cleanup 2")
          >>= fun () ->
          Lwt_unix.sleep 2.)
         <?>
         w
      )
      (fun () -> Lwt_io.eprintl "cleanup 1")
  in

  Lwt_main.run p

So it seems that using Lwt.wait allows one to have more control over what actions the program does before it terminates and at what point it terminates. But in my case, I only want to free up some resources before the program exits. So another workaround I was thinking of using is the one I mentioned above: using a global reference to keep track of the objects to be cleaned up. This workaround is less elegant and uses mutability, but I understand the control flow better in this case:

open Lwt.Infix

module StringSet = Set.Make(String)
let cleanups = ref StringSet.empty

let () =
  Lwt_unix.on_signal
    Sys.sigint
    (fun _ ->
       StringSet.elements !cleanups
       |> Lwt_list.iter_s (fun e -> Lwt_io.eprintlf "cleanup %s" e)
       |> Lwt_main.run;
       failwith "SIGINT"
    )
  |> ignore;

  let p =
    Lwt.finalize
      (fun () ->
         Lwt_io.eprintl "finalizer 1" >>= fun () ->
         cleanups := StringSet.add "1" !cleanups;
         Lwt_unix.sleep 2. >>= fun () ->
         Lwt.finalize
           (fun () ->
              cleanups := StringSet.add "2" !cleanups;
              Lwt_io.eprintl "finalizer 2" >>= fun () ->
              Lwt_unix.sleep 2. >>= fun () ->
              raise Pervasives.Exit)
           (fun () -> Lwt_io.eprintl "cleanup 1" >|= fun () ->
             cleanups := StringSet.remove "1" !cleanups)
         >>= fun () ->
         Lwt_unix.sleep 2.
      )
      (fun () -> Lwt_io.eprintl "cleanup 2" >|= fun () ->
        cleanups := StringSet.remove "2" !cleanups)
  in

  Lwt_main.run p

I think both of these workarounds can be used in combination with Lwt.finalize to create a helper that ensures that the cleanups are run when the program is interrupted. I think Lwt.pick instead of Lwt.choose (<?>) is more suitable for the workaround using Lwt.wait, as it cancels the other thread, but I may be wrong - it seems to get cancelled anyway somehow.

gaborigloi · September 30, 2017, 11:44am

Actually I’ve realized that using Lwt.pick instead of Lwt.choose on the main thread does work as expected, the program is terminated immediately, and all the finalizers get executed:

open Lwt.Infix

let () =
  let w,u = Lwt.wait () in
  Lwt_unix.on_signal
    Sys.sigint
    (fun _ ->
       Lwt.wakeup_exn u (Failure "Caught SIGINT"))
  |> ignore;

  let t () =
    Lwt.finalize
      (fun () ->
         Lwt_io.eprintl "block 1" >>= fun () ->
         Lwt_unix.sleep 2. >>= fun () ->
         Lwt.finalize
           (fun () ->
              Lwt_io.eprintl "block 2" >>= fun () ->
              Lwt_unix.sleep 2. >>= fun () ->
              raise Pervasives.Exit)
           (fun () -> Lwt_io.eprintl "cleanup 2")
         >>= fun () ->
         Lwt_unix.sleep 2.
      )
      (fun () -> Lwt_io.eprintl "cleanup 1")
  in
  let t () = Lwt.pick [t (); w] in
  Lwt_main.run (t ())

Output:

$ ./prog 
block 1
^Ccleanup 1
Fatal error: exception Failure("Caught SIGINT")
$ ./prog 
block 1
block 2
^Ccleanup 2
cleanup 1
Fatal error: exception Failure("Caught SIGINT")
$ ./prog 
block 1
block 2
cleanup 2
cleanup 1
Fatal error: exception Pervasives.Exit
$

It also works when I combine threads with Lwt.join:

open Lwt.Infix

let () =
  let w,u = Lwt.wait () in
  Lwt_unix.on_signal
    Sys.sigint
    (fun _ ->
       Lwt.wakeup_exn u (Failure "Caught SIGINT"))
  |> ignore;

  let t () =
    Lwt.finalize
      (fun () ->
         Lwt_io.eprintl "block 1" >>= fun () ->
         Lwt_unix.sleep 2. >>= fun () ->
         let t2 () =
           Lwt.finalize
             (fun () ->
                Lwt_io.eprintl "block 2" >>= fun () ->
                Lwt_unix.sleep 2.)
             (fun () -> Lwt_io.eprintl "cleanup 2")
         in
         let t3 () =
           Lwt.finalize
             (fun () ->
                Lwt_io.eprintl "block 3" >>= fun () ->
                Lwt_unix.sleep 2.)
             (fun () -> Lwt_io.eprintl "cleanup 3")
         in
         Lwt.join [t2 (); t3 ()] >>= fun () ->
         Lwt_unix.sleep 2.
      )
      (fun () -> Lwt_io.eprintl "cleanup 1")
  in
  let t () = Lwt.pick [t (); w] in
  Lwt_main.run (t ())

Output:

$ ./prog 
block 1
^Ccleanup 1
Fatal error: exception Failure("Caught SIGINT")
$ ./prog 
block 1
block 3
block 2
cleanup 3
cleanup 2
^Ccleanup 1
Fatal error: exception Failure("Caught SIGINT")
$ ./prog 
block 1
block 3
block 2
^Ccleanup 3
cleanup 2
cleanup 1
Fatal error: exception Failure("Caught SIGINT")
$

I also noticed that when a background thread isn’t connected to the main one (e.g. uses let _ = ...), its finalizer do not get executed at all when I use this method:

open Lwt.Infix

let () =
  let w,u = Lwt.wait () in
  Lwt_unix.on_signal
    Sys.sigint
    (fun _ ->
       Lwt.wakeup_exn u (Failure "Caught SIGINT"))
  |> ignore;

  let t () =
    Lwt.finalize
      (fun () ->
         Lwt_io.eprintl "block 1" >>= fun () ->
         Lwt_unix.sleep 2. >>= fun () ->
         let _ =
           Lwt.finalize
             (fun () ->
                Lwt_io.eprintl "block 2" >>= fun () ->
                Lwt_unix.sleep 2.)
             (fun () -> Lwt_io.eprintl "cleanup 2")
         in
         Lwt_unix.sleep 2.
      )
      (fun () -> Lwt_io.eprintl "cleanup 1")
  in
  let t () = Lwt.pick [t (); w] in
  Lwt_main.run (t ())

Output:

# all finalizers are executed after normal termination:
$ ./prog 
block 1
block 2
cleanup 2
cleanup 1
$ ./prog 
block 1
^Ccleanup 1
Fatal error: exception Failure("Caught SIGINT")
$ ./prog 
block 1
block 2
^Ccleanup 1
Fatal error: exception Failure("Caught SIGINT")
$

When I tried to use this method for a more complicated program with a recursive Lwt thread, it did not work, not even when I joined all the background threads together after the recursive thread , so I still had to go for the ugly workaround that I described above.

gaborigloi · November 23, 2017, 12:44pm

UPDATE: I’ve found an issue with the ugly workaround I described above (using signal handlers that stop the program by throwing an exception):
When systemd terminates the program using SIGTERM, it will output a backtrace, and will appear to be in the “failed” state. I tried wrapping the whole program with a try/with block to suppress the exception, but it didn’t work sadly. I’ve also tried setting async_exception_hook to a no-op, but that didn’t work either.

I’ve also experimented with using Lwt_main.at_exit instead of signal handlers for running to cleanup code:

(* compile with
   ocamlfind ocamlc -o prog -linkpkg -package lwt -package lwt.unix prog.ml
*)
open Lwt.Infix

module StringSet = Set.Make(String)
let cleanups = ref StringSet.empty

let () =
  Lwt_main.at_exit
    (fun () ->
       StringSet.elements !cleanups
       |> Lwt_list.iter_s (fun e -> Lwt_io.eprintlf "cleanup %s" e)
    );

  let p =
    Lwt.finalize
      (fun () ->
         Lwt_io.eprintl "finalizer 1" >>= fun () ->
         cleanups := StringSet.add "1" !cleanups;
         Lwt_unix.sleep 2. >>= fun () ->
         Lwt.finalize
           (fun () ->
              cleanups := StringSet.add "2" !cleanups;
              Lwt_io.eprintl "finalizer 2" >>= fun () ->
              Lwt_unix.sleep 2. >>= fun () ->
              raise Pervasives.Exit)
           (fun () -> Lwt_io.eprintl "cleanup 1" >|= fun () ->
             cleanups := StringSet.remove "1" !cleanups)
         >>= fun () ->
         Lwt_unix.sleep 2.
      )
      (fun () -> Lwt_io.eprintl "cleanup 2" >|= fun () ->
        cleanups := StringSet.remove "2" !cleanups)
  in

  Lwt_main.run p

But it seems that the function registered with at_exit does not get called when the program receives a signal (for example SIGINT after pressing CTRL+C):

$ ./prog 
finalizer 1
finalizer 2
cleanup 1
cleanup 2
Fatal error: exception Pervasives.Exit
$ ./prog
finalizer 1
^C⏎        
$

Topic		Replies	Views
Exception hidden from main execution in Lwt Learning lwt	4	822	April 28, 2020
[ANN] First release of Lwt-exit Community announce	0	670	December 8, 2020
Lwt.catch does not seem to do what I expect Ecosystem lwt	10	949	August 17, 2022
Signal handling checkpoints and Mutex.lock Learning	20	569	June 24, 2024
Lwt: Why does this piece of code work !? Learning lwt	10	1351	November 23, 2022

Lwt - how to catch exceptions raised by signal handlers?

Related topics