Detect use-after-close of Unix file descriptors

On POSIX systems file descriptors (FD) are eagerly re-used. As a consequence,
use-after-close of a file descriptor is often not detected and in fact
harmful because it will manipulate a different file:

(* ocamlc -I +unix unix.cma -o foo foo.ml *)

let write fd msg =
  Unix.write fd (Bytes.unsafe_of_string msg) 0 (String.length msg) |> ignore

let main () =
  let fd_a = Unix.openfile "a.txt" Unix.[O_CREAT; O_RDWR] 0o666 in
  write fd_a "aaa";
  Unix.close fd_a;

  let _fd_b = Unix.openfile "b.txt" Unix.[O_CREAT; O_RDWR] 0o666 in
  write fd_a "aaa"; (* use after close, this goes into b *)
  Unix.close fd_a (* this closes b *)

let () = main ()

In the code above the write to fd_a goes into file b.txt because the FD for a.txt was re-used.

Here is a hack that could help detect it and I would like to discuss if this could be or should be integrated into a library:

  • Unix (Linux) represents an FD as 32 bit but OCaml on 64-bit architecture uses an int that has 63 available bits.
  • On open, encode both the FD as before and in addition the inode of the FD into the OCaml int.
  • On close, check that the FD still points to the same inode if inode information is present. Raise an exception otherwise.
  • FFI code that passes an FD back to C should be safe because the cast back to an int would discard the inode information.

This passes the OCaml test suite except for the

  • testsuite/tests/lib-unix/common/cloexec.ml

which uses Obj.magic and makes assumptions that a Unix file descriptor is an integer in the OCaml representation. This new representation would likewise break other code that does this.

Below is some code as a proof of concept against OCaml 4.14.1.

commit ed3a7835817b1a281f0738ade3de91322a68dcb9
Author: Christian Lindig <christian.lindig@cloud.com>
Date:   Fri Dec 22 16:34:45 2023 +0000

    Encode inode into FD for integrity check
    
    On POSIX systems file descriptors are eagerly re-used. As a consequence
    use-after-close of a filedescritor is often not detected and in fact
    harmful because it will manipulate a different file.
    
    On Linux, a file descriptor is 32bit wide and is respresented in OCaml
    as an int, which is 63 bits on a 64-bit architecture. We can use the
    extra bits for integrity checks:
    
    * on Unix.openfile, encode the file descriptor and the lsb of the inode
      into the OCaml value.
    * on Unix.close, check that the file descriptor still refers to the same
      inode. Raise an exception if this not the case; don't check for
      integrity if no inode bits are present.
    
    This patch breaks code that uses Obj.magic and the assumption that a
    file descriptor is an integer. The test OCaml test suite contains such a
    case for
    
    * testsuite/tests/lib-unix/common/cloexec.ml
    
    This patch is also not suitable for 32-bit architectures.
    
    Signed-off-by: Christian Lindig <christian.lindig@cloud.com>

diff --git a/otherlibs/unix/close.c b/otherlibs/unix/close.c
index 961b8cbff1..3dc3c4ef72 100644
--- a/otherlibs/unix/close.c
+++ b/otherlibs/unix/close.c
@@ -17,9 +17,31 @@
 #include <caml/signals.h>
 #include "unixsupport.h"
 
+#include <sys/types.h>
+#include <sys/stat.h>
+
 CAMLprim value unix_close(value fd)
 {
   int ret;
+  long ffd;
+
+  /* check integrity */
+  ffd = Long_val(fd);
+  if ((ffd & 0x0fffffff00000000) != 0) {
+    struct stat statbuf;
+    size_t inode;
+    int rc;
+    long cs;
+
+    caml_enter_blocking_section();
+    rc = fstat(Int_val(fd), &statbuf) ;
+    caml_leave_blocking_section();
+    if (rc == -1) uerror("fstat", Nothing);
+    inode = statbuf.st_ino;
+    cs = ((inode << 32) & 0x0fffffff00000000) | Long_val(fd);
+    if (cs != ffd) uerror("integrity", Nothing);
+  }
+
   caml_enter_blocking_section();
   ret = close(Int_val(fd));
   caml_leave_blocking_section();
diff --git a/otherlibs/unix/open.c b/otherlibs/unix/open.c
index cae1ce0fd0..ad10bd631b 100644
--- a/otherlibs/unix/open.c
+++ b/otherlibs/unix/open.c
@@ -25,6 +25,9 @@
 #endif
 #include <fcntl.h>
 
+#include <sys/types.h>
+#include <sys/stat.h>
+
 #ifndef O_NONBLOCK
 #define O_NONBLOCK O_NDELAY
 #endif
@@ -60,6 +63,10 @@ CAMLprim value unix_open(value path, value flags, value perm)
   CAMLparam3(path, flags, perm);
   int fd, cv_flags, clo_flags, cloexec;
   char * p;
+  struct stat statbuf;
+  size_t inode;
+  long ffd;
+  int rc;
 
   caml_unix_check_path(path, "open");
   cv_flags = caml_convert_flag_list(flags, open_flag_table);
@@ -83,5 +90,13 @@ CAMLprim value unix_open(value path, value flags, value perm)
 #if !defined(O_CLOEXEC)
   if (cloexec) unix_set_cloexec(fd, "open", path);
 #endif
-  CAMLreturn (Val_int(fd));
+  /* remember inode in msb */
+  caml_enter_blocking_section();
+  rc = fstat(fd, &statbuf);
+  caml_leave_blocking_section();
+  if (rc == -1) uerror("fstat", path);
+  inode = statbuf.st_ino;
+  ffd = ((inode << 32) & 0x0fffffff00000000) | (long) fd;
+
+  CAMLreturn (Val_int(ffd));
 }
diff --git a/tmp/foo.ml b/tmp/foo.ml
new file mode 100644
index 0000000000..6ce5740104
--- /dev/null
+++ b/tmp/foo.ml
@@ -0,0 +1,15 @@
+(* ocamlc -I +unix unix.cma -o foo foo.ml *)
+
+let write fd msg =
+  Unix.write fd (Bytes.unsafe_of_string msg) 0 (String.length msg) |> ignore
+
+let main () =
+  let fd_a = Unix.openfile "a.txt" Unix.[O_CREAT; O_RDWR] 0o666 in
+  write fd_a "aaa";
+  Unix.close fd_a;
+
+  let _fd_b = Unix.openfile "b.txt" Unix.[O_CREAT; O_RDWR] 0o666 in
+  write fd_a "aaa"; (* use after close, this goes into b *)
+  Unix.close fd_a (* this closes b *)
+
+let () = main ()
8 Likes

which uses Obj.magic and makes assumptions that a Unix file descriptor is an integer in the OCaml representation. This new representation would likewise break other code that does this.

That would break a large number of libraries. Using Obj.magic to convert between int and file_descr is the officially recommended way to do it: see Impossible to Unix.dup2 to a specific number or to inherit Unix_file_descr by number on startup · Issue #6948 · ocaml/ocaml · GitHub

If you’re only interested in Unix, you might just as well keep using the Obj.magic trick. as it is likely to keep working in the foreseeable future. It would be prudent to check Sys.os_type in an assertion, though.

It’s very popular, e.g. sherlocode file_descr.*Obj.magic.

1 Like

I accept that a widespread use of Obj.magic for FDs disqualifies this approach. But it also suggests that the type exposed by the Unix library should be changed to int since it will be impossible to mean anything else because of this.

Unfortunately, Unix.file_descr is not an int on Windows (yes, all the code that assumes that it is an int cannot work under Windows).

Cheers,
Nicolas

I don’t think this solves the problem, anyway. It is still possible for the same file to be opened a second time. The effect is not going to be any more pleasant that way.

And there’s a straightforward approach to mitigate the problem with a combinator:

let frobnicate fd =
   Unix.write fd ...
   
let closing (frob: int -> unit) name modes create_modes =
   let fd = Unix.openfile name modes create_modes in
      begin frob fd ; Unix.close fd end

let main () =
    ...
   closing frobnicate "a.txt" Unix.[O_CREAT; O_RDWR] 0o666

Functions like with_file and similar are a good way to manage resources but not all scenarios fit the pattern. My goal was to detect problems in existing code.

With the Windows implementation of the Unix library, it would be easy to achieve what @lindig is after: the type Unix.file_descr is a heap-allocated record with flags plus a Win32 handle, so it would be easy to mark the record as “closed” in Unix.close, and check for this mark in all functions.

We could easily do the same in the Unix implementation, by implementing Unix.file_descr as a record containing flags plus an integer file descriptor. This would guard against reuse-after-close-by-Unix.close, but not -after-close-by-other-C-code, however.

But this discussion is moot because, despite the Unix.file_descr type having been abstract since day 1 of the Unix library, circa 1992, there are still many users who think they know better and “officially recommend” to use Obj.magic to go from Unix.file_descr to int.

I would not say that this is an “official” recommendation, as I have argued against this practice until I was blue in the face, but to no effect: some users just know better and cannot resist shooting themselves in the foot.

That’s why we cannot have nice things. Sorry.

8 Likes

This won’t work reliably with multiple domains, though, because there will be a time after checking the flag and before the syscall executes when another domain could close it and open something else. It can also be a problem even with a single domain when using io_uring, because several operations on an FD could be enqueued and then the FD closed directly. In Eio we use referenced counted FDs to avoid that problem (the counting is lock-free because FDs sometimes need to be used in signal handlers).

I’m happy to use a different approach than Obj.magic if there is one. I gathered from this and other comments that this is the only way to do it.

Some examples of places where this is needed:

  1. When writing a network service, typically the service manager creates the socket itself and when your service is started it inherits it (see e.g. sd_listen_fds). You need to get e.g. FD 3 as a Unix.file_descr before you can mark it close-on-exec and start using it.

  2. Graphical applications needs to connect to a display server. For Wayland applications:

    If WAYLAND_SOCKET is set, interpret it as a file descriptor number on which the connection is already established, assuming that the parent process configured the connection for us. (Wire protocol basics - The Wayland Protocol)

  3. When writing a C stub that does anything with an FD you need to know that it’s an integer. For example, Unix.select only works on low-numbered FDs, so most services need to use a wrapper around poll or similar.

That could at least suggest that this conversion should be provided by the API rather than using Obj.magic.

2 Likes

I agree with @lindig - all of those conversions can be done via C stubs, and there’s some context which suggests that adding the necessary functions to the C API of the Unix library would be accepted upstream (see Add Unix.descr_of_fd and Unix.descr_of_os by dra27 · Pull Request #1990 · ocaml/ocaml · GitHub and a request in Unified API to grab a file descriptor in unixsupport.h · Issue #5249 · ocaml/ocaml · GitHub to provide a public API for a uniform way of getting the fd, when defined, for a file_descr on both Windows and Unix). On the OCaml side, there’s a stalebot-closed, but agreed, idea to have the ability to print fd’s (e.g. for debugging/logging).

4 Likes