Delivering SIGSEGV to self in native code

mjambon · May 11, 2021, 3:12am

I was testing whether a program reports the termination status of a child correctly. So I thought I’d have the child process run this:

Unix.kill (Unix.getpid ()) Sys.sigsegv

But the child seemed unaffected and exited successfully with code 0.

It turns out the expected crash doesn’t occur with native code but occurs with bytecode:

$ cat crash.ml
Unix.kill (Unix.getpid ()) Sys.sigsegv;;
print_endline "ok"
$ ocamlc -o crash.bytecode unix.cma crash.ml
$ ocamlopt -o crash.native unix.cmxa crash.ml
$ ./crash.bytecode 
Segmentation fault (core dumped)
$ ./crash.native          # <-- expected to die similarly
ok
$ echo $?
0

The intent of the print_endline was to trigger an interrupt allowing the signal to be delivered (I may not understand those things correctly). I also tried adding a Unix.gettimeofday (), but it changed nothing. The other signals I tried, sigterm and sigusr1, both terminate the program in native code, as expected.

I’m on Linux. Is this some sort of unspecified system behavior, a bug in OCaml, or something else?

xavierleroy · May 11, 2021, 7:06am

The native-code runtime system catches SIGSEGV and (on some systems) SIGBUS in an attempt to detect stack overflows and recover from them. Better not mess with these signals. If you really need to kill a process reliably, use SIGKILL.
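
For the original testing goal (checking that a parent reports a child's termination status), the SIGKILL suggestion could be sketched roughly as follows. This is only an illustration, not code from the thread: the helper names self_signal_status and describe are made up, it assumes a Unix-like system and linking against the unix library, and Unix._exit needs OCaml 4.12 or later.

```ocaml
(* Build with the unix library, e.g. unix.cmxa as in the transcript above. *)

(* Fork a child that sends [signal] to itself and return the status the
   parent observes. [self_signal_status] is a hypothetical helper name. *)
let self_signal_status (signal : int) : Unix.process_status =
  match Unix.fork () with
  | 0 ->
      Unix.kill (Unix.getpid ()) signal;
      (* Only reached if the signal did not terminate the child. *)
      Unix._exit 0
  | child ->
      let _, status = Unix.waitpid [] child in
      status

(* Turn a termination status into a short human-readable string. *)
let describe = function
  | Unix.WEXITED code -> Printf.sprintf "exited with code %d" code
  | Unix.WSIGNALED _ -> "killed by a signal"
  | Unix.WSTOPPED _ -> "stopped by a signal"

let () =
  print_endline (describe (self_signal_status Sys.sigkill))
```

Since SIGKILL cannot be caught, the child never reaches the Unix._exit 0 line, and the parent should see WSIGNALED rather than WEXITED 0.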

gadmm · May 11, 2021, 2:36pm

Since I am currently digging into that part of the runtime, I am wondering what your use case is. I also think there is a bug there.

Note that this is about OS signal handlers, not OCaml signal handlers, so there is no need to poll to trigger signal handlers.

As explained by Xavier, native OCaml has its own signal handler for segv. It goes as follows: the segv signal is processed by OCaml’s handler, which checks whether the faulting address corresponds to a stack overflow. When it does not, OCaml removes its own signal handler and returns, so that execution restarts where the segfault happened. The expectation is that the segfault happens again, this time handled by the default handler, which aborts the program.

In your case, the segv signal is ignored the first time, and no second segv is sent: the signal came from kill rather than from a faulting instruction, so restarting does not fault again. The bug is that execution continues, but OCaml is now in an inconsistent state (no stack overflow detection henceforth). It would be better to treat the segv as fatal in all cases. To fix this, one can raise the signal again in the signal handler, or one could directly abort, or call caml_fatal_error (which will call the user-supplied error hook, if any, before aborting). Only the first two options are explicitly signal-safe.
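
Seen from the OCaml side, one possible way to sidestep the runtime's handler in a test is to restore the default SIGSEGV disposition before the process signals itself. This is an assumption on my part and not something proposed in the thread: it relies on Sys.set_signal replacing the handler installed by the runtime, and it gives up stack overflow detection in that process, so it is only reasonable in a short-lived child. The name crash_with_default_segv is made up, and Unix._exit needs OCaml 4.12 or later.

```ocaml
(* Needs the unix library. [crash_with_default_segv] is a hypothetical name. *)
let crash_with_default_segv () : Unix.process_status =
  match Unix.fork () with
  | 0 ->
      (* Replace whatever SIGSEGV handler is installed with the OS default.
         Assumption: this also displaces the runtime's own handler, so stack
         overflow detection is lost in this (short-lived) child. *)
      Sys.set_signal Sys.sigsegv Sys.Signal_default;
      Unix.kill (Unix.getpid ()) Sys.sigsegv;
      (* Reached only if the signal did not terminate the child. *)
      Unix._exit 0
  | child ->
      snd (Unix.waitpid [] child)

let () =
  match crash_with_default_segv () with
  | Unix.WSIGNALED s when s = Sys.sigsegv ->
      print_endline "child died from SIGSEGV"
  | Unix.WSIGNALED _ -> print_endline "child died from another signal"
  | Unix.WEXITED code -> Printf.printf "child exited with code %d\n" code
  | Unix.WSTOPPED _ -> print_endline "child stopped"
```

Doing this in a forked child keeps the parent's handlers untouched, which matters if the parent itself relies on stack overflow detection.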

mjambon · May 11, 2021, 6:45pm

I’m just curious. The use case, as I mentioned, was a test in which I wanted to make a child process crash on purpose without performing an actual illegal memory access. In the end it doesn’t matter: I can just raise another signal for testing purposes.

Thank you both for the explanations!
