Dune built executable killed by zsh on macOS

I can compile this simple test.ml:

let () =
  at_exit (fun () -> print_endline "Exiting.") ;
  let _ = Sys.(signal sigint (Signal_handle (fun _ -> print_endline "Received SIGINT." ; exit 100))) in
  Unix.sleep 100

by hand on the command line with the same options used by dune:

ocamlopt -o test -strict-sequence -strict-formats -short-paths -keep-locs -g ~/.opam/default/lib/ocaml/unix.cmxa test.ml

and it works as expected. However, if I build with dune, the executable is killed immediately by zsh. Weirdly enough, once this happens it still dies even after I change the code to a simple “hello world” (and “hello world” works again after dune clean; ADD: I am not sure this flip-flop behavior is reliably reproducible).

Originally I had this problem with a much larger code base, so I variously suspected a problem with Lwt or the Unix module. It is also very sensitive to certain code changes (adding a Sys.signal call, or adding print_endline to an at_exit function, would cause the exe to be killed, but otherwise it worked). After some sleuthing, however, it now appears to be related to how dune builds the artifacts. But all my opam-installed code appears to be working fine, so that would be weird too.

I am on macOS 11.1 ARM64. OCaml is installed through opam 2.0.7 native. OCaml is at version 4.10.2, and dune is 2.8.1.

My dune is minimal:

(executable
 (name test)
 (promote (until-clean))
 (libraries unix))

Any idea on how to proceed from here?

I checked my dune build log and -thread is not in the compiler options. I added -thread to my command-line build and it worked the same as before. Is this worth opening an issue on GitHub for? It would be nice to see if anyone can reproduce this behavior.

You are right, I misremembered the situation with dune and threads. Sorry, I do not have other clues.

Just to confirm, this is what you are observing:

$ dune build
$ ./_build/default/test.exe
Received SIGINT.

?

The correct behavior from the command-line build after pressing Ctrl-C:

ocaml % ./test    
^CReceived SIGINT.
Exiting.

The problem with dune build:

ocaml % ./test.exe
zsh: killed     ./test.exe

Note that the kill reported by zsh is immediate (so something is wrong) and happens without any keyboard input.

Definitely looks like a bug. Could you open a bug on github.com/ocaml/dune?

I opened a new issue yesterday: Executable built with dune killed by zsh on macOS · Issue #4135 · ocaml/dune · GitHub

I also gave more detail on some other observed oddities:

Weirdly enough, there seems to be some hysteresis in the behavior related to dune build. Starting with the above example, and after building an executable that is always killed, the killing persists even if I hand-edit the code to contain only print_endline "Hello world". This hand-edited “Hello world” only works correctly after I do a dune clean. However, if I cp or mv the same code from another existing file, it works right away without needing a dune clean. Does dune decide how and when to build certain artifacts based on more than just timestamps?

I can’t reproduce your issue on my amd64 machine running Linux.

  • What is the exit status of the command? You can do echo $? after running it. On my Linux system the exit status of a process that is killed is 128 + signal number, e.g. 130 for SIGINT, 143 for SIGTERM, and so on.
  • Does this also occur for other shells or is it specific to zsh?
  • You might also want to know what system calls the command makes. I’m not very familiar with macOS, but I think dtruss _build/default/test.exe will do that. You could compare the output between the working binary and the one that gets killed.
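
To make the exit-status arithmetic from the first bullet concrete, here is a minimal sketch in POSIX sh. The function name signal_from_status is made up for illustration, and the mapping is only a convention: a program can also deliberately exit with a code above 128, so the result is a hint rather than proof.

```shell
#!/bin/sh
# Hypothetical helper: map a shell exit status to the signal number that
# (by convention) terminated the process, or "none" if it is 128 or below.
signal_from_status() {
  status="$1"
  if [ "$status" -gt 128 ]; then
    echo $((status - 128))
  else
    echo "none"
  fi
}

signal_from_status 130   # prints 2  (SIGINT)
signal_from_status 143   # prints 15 (SIGTERM)
```

In practice you would run the binary and then call signal_from_status "$?" right away, before any other command overwrites the status.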

This is almost definitely a codesigning bug. The new ARM64 Macs have much stricter validation when it comes to codesigning, and the kernel will refuse to run code without a valid signature (and sends a SIGKILL).

It is strange, though, how involving dune breaks it.

To make matters worse, there was (and still is?) a bug with the codesigning tool with the latest Xcode such that it will often fail with an ambiguous error about the codesign_allocate tool being unusable.

Luckily some people over at the golang project figured out that if you make a new inode for the binary the tool will work on the new copy: https://github.com/golang/go/issues/42684#issuecomment-729543666

@zhtprog does codesigning the binary fix the issue? Obviously it would be a temporary solution, but a solution nonetheless.


@droyo echo $? reports 137, so presumably that is 128 + 9, i.e. SIGKILL, which matches what zsh reports. dtruss did not produce any information despite requiring privileges to run. Not sure if the following has anything to do with it:

dtrace: system integrity protection is on, some features will not be available

@zbaylin Yes, that sounds like a good guess. Does codesigning have anything to do with the source file or an intermediate inode? I observed that editing the source file of a killed exe does not stop the kill until I run dune clean or change the source file's inode. I could try to figure out how to codesign, and that should show whether this is the reason.


@zbaylin You were absolutely right that codesigning is the reason. After a fresh dune build I ran codesign -s - test.exe, and now it runs the same as the command-line build. Of course it remains a mystery why this happens, as my opam-installed packages have so far all appeared to work as expected. I will update the GitHub issue as well.
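
For anyone who hits the same thing, here is a rough sketch of the workaround as a POSIX sh function. It combines the two ideas from this thread: give the binary a fresh inode (the trick from the golang issue linked above), then apply an ad hoc signature with codesign. The function name resign and the temporary-file naming are my own inventions, and I have only reasoned about this rather than tested it across setups, so treat it as a temporary fix and not as what dune should be doing.

```shell
#!/bin/sh
# Hypothetical helper: re-sign a binary ad hoc after giving it a fresh inode.
resign() {
  bin="$1"
  tmp="$bin.resign.$$"              # temporary name; assumed not to exist
  cp -p "$bin" "$tmp" || return 1   # the copy is a new file, so a new inode
  mv -f "$tmp" "$bin" || return 1   # the rename keeps the copy's inode
  if command -v codesign >/dev/null 2>&1; then
    # "-" selects an ad hoc identity; --force replaces any existing signature.
    codesign --force --sign - "$bin"
  else
    echo "codesign not found; skipping signing" >&2
  fi
}
```

Usage would be something like resign _build/default/test.exe after each dune build. The copy step should only matter if codesign fails with the codesign_allocate error mentioned above; in my case the plain codesign -s - call was enough.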
