try
  Unix.write fd buf offset len
with e ->
  Log.debug (fun m -> m "Failed to send %a" Fmt.exn e);
  0
My program exits. My mental model says that this should either succeed or raise an exception. Is there a case where it would simply exit without raising one?
It may be that the output channels are not being flushed before exiting (this should not normally happen, but…). Did you try flushing them explicitly (with either Format or Printf, I am not sure which you are using)?
In a C/POSIX program using sockets you almost always want to ignore SIGPIPE (set its handler to SIG_IGN), so that a Unix write fails with an EPIPE error instead of generating a SIGPIPE signal, which by default terminates the program, when you attempt to write to a socket whose other end is no longer open for reading.
Presumably the same is true if you want an EPIPE exception in OCaml instead of SIGPIPE. So I suggest you do the necessary with Sys.signal.
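For what it is worth, here is a minimal sketch of that suggestion. It assumes the program links the unix library; safe_write and demo are hypothetical names invented for the illustration, and returning 0 on EPIPE just mirrors the snippet in the question.

```ocaml
(* Sketch only: requires the unix library. *)

(* Ignore SIGPIPE so that writing to a closed pipe or socket raises
   Unix.Unix_error (Unix.EPIPE, _, _) instead of killing the process. *)
let () = Sys.set_signal Sys.sigpipe Sys.Signal_ignore

(* Hypothetical helper: returns the number of bytes written,
   or 0 if the other end has gone away (EPIPE). *)
let safe_write fd buf offset len =
  try Unix.write fd buf offset len
  with Unix.Unix_error (Unix.EPIPE, _, _) -> 0

(* Demonstration: write to a pipe whose read end is already closed. *)
let demo () =
  let r, w = Unix.pipe () in
  Unix.close r;
  let n = safe_write w (Bytes.of_string "hello") 0 5 in
  Unix.close w;
  n
```

Without the Sys.set_signal line, the write in demo would be expected to terminate the process with SIGPIPE before any exception handler runs, which matches the behaviour described in the question.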
The ideal way to avoid the signal is not to use write, but to use send with MSG_NOSIGNAL, so that you do not have to install a signal handler. Unfortunately, the Unix module does not expose MSG_NOSIGNAL (although it is part of POSIX nowadays).
SIGPIPE is not in the Unix module; it’s in the Sys module (Sys.sigpipe). As far as I can see, MSG_NOSIGNAL is not included in Unix.msg_flag. The documentation only lists MSG_OOB, MSG_DONTROUTE and MSG_PEEK.
As it happens I don’t usually bother with the MSG_NOSIGNAL flag for sending to sockets when writing C code. SIGPIPE is only really useful for small programs intended to have their stdin and stdout piped at the unix shell, which is unlikely to involve opening and writing to sockets or opening other pipes/fifos. If I have a program which uses sockets or explicit pipes/fifos, I normally ignore SIGPIPE at the outset and have done with it. EPIPE is what you need for those. So the fact that MSG_NOSIGNAL is not wrapped by OCaml doesn’t worry me.
In my experience programming in distributed systems (not in OCaml, but in many other languages), every systems-hacker has that experience where they learn about reset sockets, SIGPIPE, ignoring the signal, EPIPE, and the necessity of checking return-codes from write (and other syscalls) … in that order. For many of us, it’s drilled into us by some very gruesome screwup. Almost a rite of passage to forget some part of that checklist, and learn the hard way never to do it again.
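On the "checking return-codes from write" item, a rough OCaml sketch might look like the following. Note that Unix.write already retries until everything is written or an error occurs, so the sketch uses Unix.single_write, which can return a short count. write_all is a hypothetical helper name, and retrying on EINTR is an assumption about what the caller wants.

```ocaml
(* Sketch only: requires the unix library. *)

(* Hypothetical helper: keep calling Unix.single_write until all [len]
   bytes starting at [ofs] have been written. Other errors, such as
   EPIPE, propagate to the caller as Unix.Unix_error. *)
let rec write_all fd buf ofs len =
  if len > 0 then begin
    match Unix.single_write fd buf ofs len with
    | n ->
      (* Possibly a short write: continue with the remaining bytes. *)
      write_all fd buf (ofs + n) (len - n)
    | exception Unix.Unix_error (Unix.EINTR, _, _) ->
      (* Interrupted by a signal before writing anything: retry. *)
      write_all fd buf ofs len
  end
```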
Yes, sockets seem to abound in such things, where basically the defaults are wrong: SIGPIPE, SO_REUSEADDR, FD_CLOEXEC. And for some reason the tutorials don’t seem to draw attention to them.
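As a hedged illustration of overriding those three defaults from OCaml, something along these lines should work. It assumes OCaml 4.05 or later for the ~cloexec argument and the unix library; make_listener, the loopback address, and the backlog of 16 are arbitrary choices for the sketch.

```ocaml
(* Sketch only: requires the unix library and OCaml >= 4.05. *)

(* SIGPIPE: ignore it so that writes fail with EPIPE instead. *)
let () = Sys.set_signal Sys.sigpipe Sys.Signal_ignore

(* Hypothetical helper: a listening TCP socket on the loopback
   interface with the non-default options set. *)
let make_listener port =
  (* FD_CLOEXEC: do not leak the descriptor into exec'd children. *)
  let sock = Unix.socket ~cloexec:true Unix.PF_INET Unix.SOCK_STREAM 0 in
  (* SO_REUSEADDR: allow rebinding the port soon after a restart. *)
  Unix.setsockopt sock Unix.SO_REUSEADDR true;
  Unix.bind sock (Unix.ADDR_INET (Unix.inet_addr_loopback, port));
  Unix.listen sock 16;
  sock
```

On older compilers, Unix.set_close_on_exec sock after creating the socket would be the closest equivalent to ~cloexec:true.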