I have had a hard time trying to reduce a segmentation fault observed with OCaml 5.0. I would prefer not to file an issue before having something smaller to reproduce and before making sure that the fault is not in my code, but since I have not managed to make progress for three weeks, I dare ask the community for some help!
Here are some steps that should reproduce the segmentation fault:
$ docker run -it --rm ocaml/opam:debian-11-ocaml-5.0
opam@...:~$ opam pin add https://github.com/thierry-martinez/stdcompat.git#disable-magic
opam@...:~$ opam pin add https://github.com/thierry-martinez/metapp.git
opam@...:~$ opam pin add https://github.com/thierry-martinez/metaquot.git#ocaml-5.0-segfault
opam@...:~$ opam install refl
The segmentation fault occurs while metaquot.ppx preprocesses the source file ppx_refl.ml, and disappears when Gc.minor () is called before preprocessing each expression (see Solve segfault with OCaml 5.0 · thierry-martinez/metaquot@e19ef99 · GitHub). The segmentation fault does not occur with OCaml 4.14 and below, and disappears with small variations in the code: when I try to reduce the size of ppx_refl.ml, when I change the code of metaquot.ml, or even when I try to embed the relevant parts of metaquot to make the example more standalone.
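For readers who do not want to open the commit, here is a minimal sketch of the shape of that workaround. It is not the actual metaquot code: with_minor_gc and process_expression are hypothetical names standing in for the ppx's per-expression handler, and only Gc.minor comes from the standard library.

```ocaml
(* Hypothetical helper: force a minor collection before each call to [f].
   This only illustrates the pattern of the workaround; the real change
   is in the linked metaquot commit. *)
let with_minor_gc (f : 'a -> 'b) (x : 'a) : 'b =
  Gc.minor ();
  f x

(* Hypothetical stand-in for the function that preprocesses one expression. *)
let process_expression (s : string) : string =
  String.uppercase_ascii s

let () =
  (* Every expression is processed right after a minor collection. *)
  let results = List.map (with_minor_gc process_expression) ["a"; "b"] in
  assert (results = ["A"; "B"])
```

Since this only changes when collections happen, it presumably hides the fault rather than fixing its cause.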
There are no magic or FFI calls in the code of stdcompat (as pinned), metapp, and I believe neither in
valgrind gives the following backtrace:
==1294611== Process terminating with default action of signal 11 (SIGSEGV)
==1294611==  Bad permissions for mapped region at address 0x56D9E8
==1294611==    at 0xA5C849: atomic_store_relaxed (platform.h:68)
==1294611==    by 0xA5C849: mark_slice_darken (major_gc.c:690)
==1294611==    by 0xA5C849: do_some_marking (major_gc.c:720)
==1294611==    by 0xA5C849: mark (major_gc.c:730)
==1294611==    by 0xA5CD46: major_collection_slice (major_gc.c:1241)
==1294611==    by 0xA5D704: caml_major_collection_slice (major_gc.c:1365)
==1294611==    by 0xA4CEFC: caml_poll_gc_work (domain.c:1523)
==1294611==    by 0xA60E4C: caml_check_urgent_gc (minor_gc.c:867)
==1294611==    by 0xA6BC6E: caml_c_call (in /home/tmartine/tmp/ppx_segfault/_build/default/ppx_segfault.exe)
==1294611==    by 0x4DB151F: ???
==1294611==    by 0x4DB7AC7: ???
==1294611==
I wish I could provide a smaller, more standalone example, but I am stuck on how to reduce it without making the segmentation fault disappear. Any help in understanding the problem will be appreciated! (Even knowing whether the problem can be reproduced in various settings would be useful!) Thank you very much!