For a long time I wondered about signals. Essentially, I would like to program without having to think about them at all.
For example, POSIX says that close can fail with EBADF (which I am happy to handle) and EINTR (the system call was interrupted) which I would rather not handle since presumably it involves doing something awful like re-trying the close call. So I want a close call that either fails with EBADF, or succeeds and closes the fd.
Linux man page for close has the following:
Retrying the close() after a failure return is the wrong thing to do,
since this may cause a reused file descriptor from another thread to
be closed. This can occur because the Linux kernel always releases
the file descriptor early in the close operation, freeing it for
reuse; the steps that may return an error, such as flushing data to
the filesystem or device, occur only later in the close operation.
But this seems to stop short of saying that close will always close the fd even if EINTR is returned (āsteps that may return an errorā could be interpreted as those steps which may return non-EINTR errors). So it is not even clear to me how (in a concurrent setting) you can actually close a file descriptor cleanly.
There is even an LWN article about POSIX and Linux behaviours of close (POSIX says to retry close in a loop; Linux says to never do this):
Is there any way to program without having to consider these signals? Presumably some signals are fine (those that forcibly kill the process because something terrible has happened). But the rest I just donāt want to think about; I would rather deal with the errors that occur as a result of using the system calls, not from some unrelated signal-type thing.
How do Core and other libraries deal with this?
At a more basic level: what is the safe way to (portably?) close a file descriptor? And why is this so hard???
One solution is what is done in containers (specifically CCIO), that allows you to run your code in some kind of context, and lets the library handle most errors for instance:
CCIO.with_out "some/file" (fun out_ch ->
... (* write to the output channel *)
)
This way, the library can automatically handle closing the file descriptor after it calls the function provided by the user.
Ah, but the point of my post is that it seems impossible to even close an fd in a portable way. So I donāt see that CCIO can provide any such guarantee (and indeed it seems to just call OCamlās standard Unix.close).
Because NFS has no āreserve some space for meā operation and your local machine buffered the data until close() was called, you can only get an āout of disk spaceā error on the close(); the first the remote fileserver heard of your new data is when your local machine started sending it writes as you closed the file. Now suppose that this close() takes long enough that it gets interrupted with an EINTR. If the file descriptor is now invalid, your program has no way to find out that the data it thought it had written has in fact been rejected.
Thus itās at least sensible for Unix systems to worry about this potential case and decide that close() should not close the file descriptor in the EINTR case.
Iām not sure I agree with his reasoning. If you want to detect errors due to data not being written you need to fsync and pick up the errors from that. Assuming a valid fd, close should always close it. This is the only interface that makes sense in a concurrent setting.
Could you be more precise as to what the problem is here ?
As far as I can recall bos tries to be as robust as possible. Which means that if EINTR is raised: we retry. Now if the fd was in fact closed this will be EBADF and ignored (see the uses of close in the module), if itās not then we will retry to close it.
And I grant that the Linux man page is not completely clear that close does, in fact, guarantee to close the fd. But from what I can see, this is the intention.
For the concurrent case, I suppose we could allocate a global lock and require all uses of close to take the lock first, and then potentially retry the close (on HP/UX et al?)ā¦ (Will to live slowly ebbing away)
Iām unable to think of a version of Unix where close can actually be interrupted by a signal. Yes, POSIX claims it is possible, and I suppose a version would be standards conformant if it permitted it to happen, but Iāve never seen it in a real implementation. After using doing Unix systems programming for 33 years or so now, Iāve also never seen close(2) fail on a valid descriptor and Iāve never seen a bug caused by failure to test for EINTR as a return value from close(2) with retry.
I might be horribly mistaken, but I think you can just ignore that completely. Certainly if you did ignore it, your code would look like essentially everyone elseās.
This permits the behavior that occurs on Linux and many other
implementations, where, as with other errors that may be reported by
close(), the file descriptor is guaranteed to be closed. However, it
also permits another possibility: that the implementation returns an
EINTR error and keeps the file descriptor open. (According to its
documentation, HP-UXās close() does this.) The caller must then once
more use close() to close the file descriptor, to avoid file
descriptor leaks. This divergence in implementation behaviors
provides a difficult hurdle for portable applications, since on many
implementations, close() must not be called again after an EINTR
error, and on at least one, close() must be called again. There are
plans to address this conundrum for the next major release of the
POSIX.1 standard.
Iāll take peopleās word for it that HP-UX used to do this. I would ignore it. That behavior is clearly broken. System calls should be interruptable (at least to userlandās knowledge) only if theyāre long. Short calls of guaranteed duration like close(2) should never allow themselves to be interrupted. Donāt worry about it, donāt program on the assumption that it can happen.
(In the exceptionally unlikely event that you leak a single file descriptor every year this way in a long lived process, you will never notice anyway. Ignore it.)