I have had a hard time trying to reduce a segmentation fault observed with OCaml 5.0. I would prefer not to file an issue before having something smaller to reproduce and before checking for sure that the fault is not in my code, but since I have not managed to make progress for three weeks, I dare ask the community for some help!
Here are some steps that should reproduce the segmentation fault:
$ docker run -it --rm ocaml/opam:debian-11-ocaml-5.0
opam@...:~$ opam pin add https://github.com/thierry-martinez/stdcompat.git#disable-magic
opam@...:~$ opam pin add https://github.com/thierry-martinez/metapp.git
opam@...:~$ opam pin add https://github.com/thierry-martinez/metaquot.git#ocaml-5.0-segfault
opam@...:~$ opam install refl
The segmentation fault occurs while metaquot.ppx preprocesses the source file ppx_refl.ml, and disappears when Gc.minor () is called before preprocessing each expression (see Solve segfault with OCaml 5.0 · thierry-martinez/metaquot@e19ef99 · GitHub). The segmentation fault does not occur with OCaml 4.14 and below, and disappears with small variations in the code: when I try to reduce the size of ppx_refl.ml, when I change the code of metaquot.ml, or even when I try to embed the relevant parts of ppxlib with metaquot to make the example more standalone.
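For readers who do not want to open the commit, the Gc.minor () workaround has roughly the following shape. This is only a minimal sketch: with_minor_gc and preprocess_expression are hypothetical names used for illustration, and the real code in metaquot goes through ppxlib rather than plain strings.

```ocaml
(* Hypothetical helper, not the actual metaquot code: force a minor
   collection right before each call to [f]. *)
let with_minor_gc (f : 'a -> 'b) (x : 'a) : 'b =
  Gc.minor ();
  f x

(* Stand-in for the per-expression preprocessing step (assumption:
   the real step rewrites a Parsetree expression, not a string). *)
let preprocess_expression (s : string) : string =
  String.uppercase_ascii s

let () =
  (* Every element is preprocessed just after a forced minor collection. *)
  let results =
    List.map (with_minor_gc preprocess_expression) [ "a"; "b"; "c" ]
  in
  assert (results = [ "A"; "B"; "C" ])
```

Forcing a minor collection this way only hides the crash; it says nothing about where the fault actually comes from.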
There is neither magic nor any FFI call in the code of stdcompat (as pinned), metaquot, or metapp, and I believe there is none in ppxlib either.
valgrind gives the following backtrace:
==1294611== Process terminating with default action of signal 11 (SIGSEGV)
==1294611== Bad permissions for mapped region at address 0x56D9E8
==1294611== at 0xA5C849: atomic_store_relaxed (platform.h:68)
==1294611== by 0xA5C849: mark_slice_darken (major_gc.c:690)
==1294611== by 0xA5C849: do_some_marking (major_gc.c:720)
==1294611== by 0xA5C849: mark (major_gc.c:730)
==1294611== by 0xA5CD46: major_collection_slice (major_gc.c:1241)
==1294611== by 0xA5D704: caml_major_collection_slice (major_gc.c:1365)
==1294611== by 0xA4CEFC: caml_poll_gc_work (domain.c:1523)
==1294611== by 0xA60E4C: caml_check_urgent_gc (minor_gc.c:867)
==1294611== by 0xA6BC6E: caml_c_call (in /home/tmartine/tmp/ppx_segfault/_build/default/ppx_segfault.exe)
==1294611== by 0x4DB151F: ???
==1294611== by 0x4DB7AC7: ???
==1294611==
I wish I could produce a smaller, more standalone example, but I am stuck on how to reduce it without making the segmentation fault disappear. Any help in understanding the problem will be appreciated! (Even knowing whether or not the problem can be reproduced in various settings would be useful!) Thank you very much!