System calls, Unix.close, signals, EINTR


For a long time I wondered about signals. Essentially, I would like to program without having to think about them at all.

For example, POSIX says that close can fail with EBADF (which I am happy to handle) and EINTR (the system call was interrupted), which I would rather not handle, since presumably handling it involves doing something awful like retrying the close call. So I want a close call that either fails with EBADF, or succeeds and closes the fd.

The Linux man page for close(2) says the following:

Retrying the close() after a failure return is the wrong thing to do,
since this may cause a reused file descriptor from another thread to
be closed. This can occur because the Linux kernel always releases
the file descriptor early in the close operation, freeing it for
reuse; the steps that may return an error, such as flushing data to
the filesystem or device, occur only later in the close operation.

But this seems to stop short of saying that close will always close the fd even if EINTR is returned (“steps that may return an error” could be interpreted as those steps which may return non-EINTR errors). So it is not even clear to me how (in a concurrent setting) you can actually close a file descriptor cleanly.

There is even an LWN article about POSIX and Linux behaviours of close (POSIX says to retry close in a loop; Linux says to never do this):

Is there any way to program without having to consider these signals? Presumably some signals are fine (those that forcibly kill the process because something terrible has happened). But the rest I just don’t want to think about; I would rather deal with the errors that occur as a result of using the system calls, not from some unrelated signal-type thing.

How do Core and other libraries deal with this?

At a more basic level: what is the safe way to (portably?) close a file descriptor? And why is this so hard???


Edit: an Austin group bug report that is relevant:

One solution is what is done in the containers library (specifically CCIO), which lets you run your code in some kind of context and have the library handle most errors. For instance:

CCIO.with_out "some/file" (fun out_ch ->
  ... (* write to the output channel *)
)

This way, the library can automatically handle closing the file descriptor after it calls the function provided by the user.

Ah, but the point of my post is that it seems impossible to even close an fd in a portable way. So I don’t see that CCIO can provide any such guarantee (and indeed it seems to just call OCaml’s standard Unix.close).

close is especially hard, since if it returns EINTR, the state of the file descriptor is unspecified (according to POSIX, AFAICT). lwt calls close only once (in contrast to other system calls, which are wrapped to be retried on EINTR); bos calls close repeatedly if it returns EINTR (EDIT: but it provides and uses a close_no_err function just below, which safely closes the file descriptor by ignoring any exception).

There’s more discussion of the topic by Colin Percival (“close is broken”), and by Chris (who suggests that POSIX should keep the file descriptor intact if EINTR is returned). As Colin points out, it is even worse in a concurrent setting.

Bos has:

let rec close fd = try Unix.close fd with
| Unix.Unix_error (Unix.EINTR, _, _) -> close fd

So this only works in the single-threaded case (and under Linux it is probably wrong anyway in case EINTR is raised).

If we can’t even close a file descriptor properly, what hope is there? How long has this API been around? 40 years or more?

As Donald would say… sad sad sad

From Chris’ page:

Because NFS has no ‘reserve some space for me’ operation and your local machine buffered the data until close() was called, you can only get an ‘out of disk space’ error on the close(); the first the remote fileserver heard of your new data is when your local machine started sending it writes as you closed the file. Now suppose that this close() takes long enough that it gets interrupted with an EINTR. If the file descriptor is now invalid, your program has no way to find out that the data it thought it had written has in fact been rejected.

Thus it’s at least sensible for Unix systems to worry about this potential case and decide that close() should not close the file descriptor in the EINTR case.

I’m not sure I agree with his reasoning. If you want to detect errors due to data not being written you need to fsync and pick up the errors from that. Assuming a valid fd, close should always close it. This is the only interface that makes sense in a concurrent setting.

Could you be more precise as to what the problem is here?

As far as I can recall, bos tries to be as robust as possible, which means that if EINTR is raised, we retry. Now, if the fd was in fact already closed, this will be EBADF and ignored (see the uses of close in the module); if it was not, we will retry closing it.


Under Linux, close is guaranteed to close the fd. So retrying does not make sense.

But I admit this is rather pernickety of me!

And I grant that the Linux man page is not completely clear that close does, in fact, guarantee to close the fd. But from what I can see, this is the intention.

And we still have the problem of what to do in the concurrent setting, where re-trying the close is definitely not right.

If someone can open an issue about it I’ll give a shot at fixing this in the future.

For the concurrent case, I suppose we could allocate a global lock and require all uses of close to take the lock first, and then potentially retry the close (on HP-UX et al?)… (Will to live slowly ebbing away)

I’m not really serious! Special casing to the particular OS seems too awful to contemplate.

Well between this and spending days chasing a heisenbug where fds randomly get closed, I prefer the former.

I’m unable to think of a version of Unix where close can actually be interrupted by a signal. Yes, POSIX claims it is possible, and I suppose a version would be standards conformant if it permitted it to happen, but I’ve never seen it in a real implementation. After doing Unix systems programming for 33 years or so now, I’ve also never seen close(2) fail on a valid descriptor, and I’ve never seen a bug caused by failure to test for EINTR as a return value from close(2) with retry.

I might be horribly mistaken, but I think you can just ignore that completely. Certainly if you did ignore it, your code would look like essentially everyone else’s.

(Just to restate: I’m not really sure I’m serious)

From:

This permits the behavior that occurs on Linux and many other
implementations, where, as with other errors that may be reported by
close(), the file descriptor is guaranteed to be closed. However, it
also permits another possibility: that the implementation returns an
EINTR error and keeps the file descriptor open. (According to its
documentation, HP-UX’s close() does this.) The caller must then once
more use close() to close the file descriptor, to avoid file
descriptor leaks. This divergence in implementation behaviors
provides a difficult hurdle for portable applications, since on many
implementations, close() must not be called again after an EINTR
error, and on at least one, close() must be called again. There are
plans to address this conundrum for the next major release of the
POSIX.1 standard.

I’ll take people’s word for it that HP-UX used to do this. I would ignore it. That behavior is clearly broken. System calls should be interruptible (at least to userland’s knowledge) only if they’re long. Short calls of guaranteed duration like close(2) should never allow themselves to be interrupted. Don’t worry about it, don’t program on the assumption that it can happen.


(In the exceptionally unlikely event that you leak a single file descriptor every year this way in a long lived process, you will never notice anyway. Ignore it.)