Signal management in Sys (Sys.set_signal)

Hello,

I’m seeing strange behavior in signal handling with Sys.set_signal, and I have some questions (see the end of this post).

Here is the OCaml code:

let f signal =
  (* Do not do this in real code (never print from a signal handler) *)
  Fmt.pr "Got signal %d@." signal;
  exit 1

let route_sig ()  =
  let open Sys in
  Fmt.pr "Route %d@." sigint;
  set_signal sigint (Signal_handle f)

let _ =
  route_sig ();
  Fmt.pr "Waiting.....@.";
  Unix.sleep 60;
  Fmt.pr "Finalizing.....@."

Here is the C code

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static void sig_handler(int signo)
{
    /* Never use printf in a signal handler - just for demo */
    printf("Got signal %d\n", signo);
    exit(1);
}

static void  route_sig ()
{
    printf("Route %d\n", SIGINT);
    signal(SIGINT, sig_handler);
}


int main(void)
{
    route_sig();
    printf("Waiting....\n");
    sleep(60);
    printf("Finalizing.....\n");
    return 0;
}


**When running the OCaml program and sending it a SIGINT, the handler is only called after Unix.sleep returns:**

./sigint.ml.out
[000.0] Route -6
[000.0] Waiting....
[008.1] ^C
[060.2] Got signal -6

With the C program, sig_handler is called immediately:

./sigint.c.out
[000.0] Route 2
[000.0] Waiting....
[007.8] ^C
[007.8] Got signal 2

That probably means that the OCaml handler is not called directly from the system signal handler but through some kind of deferred callback.
That seems strange, as it is not mentioned in the Sys documentation.

Now, suppose that we want to catch SIGSEGV.
C example:

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static void sig_handler(int signo)
{
    /* Never use printf in a signal handler - just for demo */
    printf("Got signal %d\n", signo);
    exit(1);
}

static void  route_sig ()
{
    printf("Route %d\n", SIGSEGV);
    signal(SIGSEGV, sig_handler);
}


int main(void)
{
    int *a=0;
    route_sig();
    printf("Waiting....\n");
    *a=1;
    printf("Finalizing.....\n");
    return 0;
}

./sigsegv.c.out
[000.0] Route 11
[000.1] Waiting....
[000.1] Got signal 11

The same in OCaml, with a buggy C part:

C:
#include <stddef.h>
void sigv_c_fct ()
{
    int *a= NULL;
    *a=1;
}

OCaml:
external sigv_c_fct : unit -> unit = "sigv_c_fct"

let f signo =
  (* Do not do this in real code (never print from a signal handler) *)
  Fmt.pr "Got signal %d@." signo;
  exit 1

let route_sig ()  =
  let open Sys in
  Fmt.pr "Route %d@." sigsegv;
  set_signal sigsegv (Signal_handle f)

let _ =
  route_sig ();
  Fmt.pr "Waiting.....@.";
  sigv_c_fct ();
  Fmt.pr "Finalizing.....@."

./sigsegv.ml.out
[000.0] Route -10
[000.1] Waiting....
and it never ends.

strace shows that SIGSEGV is raised over and over until I SIGKILL the process:

rt_sigreturn({mask=[]})                 = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0} ---
rt_sigreturn({mask=[]})                 = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0} ---

This shows that the system signal handler has returned.
The main process then re-executes the faulty instruction (the PC has not changed), which raises the signal again.
The OCaml handler is never called.
I suspect that SIGBUS and SIGILL behave the same way.


Q1: Is there something I am missing in the usage of Sys.set_signal?

For some reason (very small permanent storage), I cannot generate a core dump when a process is killed by a SEGV.
Routing SIGSEGV to OCaml would help me store some useful information about my process when it crashes, but I cannot do it like this.

Q2: How can I get back into OCaml from the signal context? Is it possible?

Thanks,
Erwan

In fact, OCaml does not treat signals in a strictly asynchronous fashion. On receiving a signal, OCaml records the receipt of the signal but the signal handling function will only be executed at certain checkpoints.

This is from “Unix system programming in OCaml” by Xavier Leroy and Didier Rémy, both authors of OCaml. This explains why the signal is not handled during the Unix.sleep.

Hi
Thanks for your answer.
However, that does not explain why the SIGSEGV handling function is never executed.
Erwan

Oh yes it does. After the signal is recorded for later processing, the main program resumes… by re-executing the faulting instruction that causes the SEGV in the first place!

OK, I replied too quickly.
The loop (segv -> sighandler) is exactly what I saw, and that’s my actual problem :frowning:

I actually have a segv and I cannot store the core (no writable storage).
To debug it I have to retrieve some information from the OCaml part.
As it seems that I cannot use Sys.set_signal for that, is there a way to get back into an OCaml context from the system signal handler? (And why can we call set_signal on SIGSEGV if we cannot use it :no_mouth:?)

That’s a tough question! OCaml just makes available the whole list of POSIX signals… even though some of them like SIGSEGV or SIGFPE are generally raised synchronously, in a way that OCaml cannot handle. Better documentation is in order.

Back to your bigger question: it’s exceedingly hard to safely recover from a segv, even from C! Even if you handle the segv signal from C, you won’t know what state the program is in, so running any nontrivial code (e.g. printf()) at this point is unsafe, let alone trying to call back into OCaml.

If you can’t write a core dump, could you at least run the program under a debugger, or monitor it with ptrace() from another process?
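
A much simpler relative of the "monitor it from another process" idea, sketched here with plain fork() and waitpid() rather than ptrace(): a parent process that only learns which signal killed its child. The names run_supervised, crashing_child and healthy_child are hypothetical, and a real supervisor would exec the actual program in the child instead of calling a function. Unlike ptrace(), this gives no access to registers or memory.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs child_main in a forked child and waits for it.
   Returns the number of the signal that killed the child,
   0 if the child was not killed by a signal, or -1 on error. */
int run_supervised(void (*child_main)(void))
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        child_main();
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    return 0;
}

/* Demo children: one dies from SIGSEGV (default action), one exits cleanly. */
void crashing_child(void) { raise(SIGSEGV); }
void healthy_child(void) { }
```

Here run_supervised(crashing_child) returns SIGSEGV and run_supervised(healthy_child) returns 0, so the parent can at least log the crash without any writable storage for a core file.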

Thanks for the answers
I’ll use a debugger
Cheers

I took a look at this because, while @lindig is correct, I would have expected the OCaml handler to be executed by the signal handler while a C system call, especially sleep, is in progress. The code that handles this case is in asmrun/signals_asm.c, in the handle_signal function.

I suspect that Fmt.pr is buffering the output or something similar, because the following modified version of the original code worked as expected (note the %! at the end of each printf format to force a flush):

let f signal =
  (* Do not do this in real code (never print from a signal handler) *)
  Printf.printf "Got signal %d\n%!" signal;
  exit 1

let route_sig signum =
  let open Sys in
  Printf.printf "Route %d\n%!" signum;
  set_signal signum (Signal_handle f)

let _ =
  route_sig Sys.sigint;
  Printf.printf "Waiting.....\n%!";
  Unix.sleep 60;
  Printf.printf "Finalizing.....\n%!"

In the SEGV case, the C code does not release and reacquire the runtime lock around the buggy code; as a result, the signal is queued and the infinite loop noted by @xavierleroy occurs.

I may have missed something, though, as the signal handling is deep in the runtime.