Closing file descriptors via GC finaliser?



Using C, one can register functions that the GC will call before collecting an object. In theory it should be possible to create a library that uses this to close file descriptors automatically when they become unreachable. This is not a new idea, but I am curious why it is considered such a bad idea that I have not seen any implementation of it.


It’s generally better to explicitly control the lifetime of external resources, since you have to decide what to do under resource starvation conditions. If you run out of fds, then you are forced to trigger a (probably unnecessary) memory GC in order to get more. That in turn stalls the entire application and adds network latency.
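
As a minimal sketch of what explicit lifetime control can look like, assuming OCaml 4.08 or later (for Fun.protect). The helper name with_in_file is made up for illustration, and it uses a standard-library channel rather than a raw descriptor:

```ocaml
(* Hypothetical helper: the file is open exactly for the duration of [f]. *)
let with_in_file path f =
  let ic = open_in path in
  (* [finally] runs whether [f] returns normally or raises, so the
     descriptor is released at a known point, not at some later GC cycle. *)
  Fun.protect
    ~finally:(fun () -> close_in_noerr ic)
    (fun () -> f ic)
```

With this shape, running out of descriptors shows up as an exception from open_in at the call site, where the caller can decide what to do about it.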


Would such a library not still be useful for catching leaks and emitting warnings? The expectation would be that the file has already been closed by the time it is collected; if it has not, the finaliser closes it and issues a warning.
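
A rough sketch of such a leak-catching wrapper, built on Gc.finalise from the standard library. All names here (t, wrap, close, finaliser) are hypothetical, and close_fd is passed in as a parameter (it would typically be Unix.close) so that the sketch does not depend on the unix library:

```ocaml
(* Illustrative only: a descriptor wrapper that warns if it is leaked. *)
type 'fd t = {
  fd : 'fd;
  close_fd : 'fd -> unit;   (* assumed to be something like Unix.close *)
  mutable closed : bool;
}

(* Explicit release; calling it twice is a no-op. *)
let close t =
  if not t.closed then begin
    t.closed <- true;
    t.close_fd t.fd
  end

(* Called by the GC once [t] is unreachable. It receives [t] as an
   argument, so it must not capture [t] in its own closure. *)
let finaliser t =
  if not t.closed then begin
    prerr_endline "warning: descriptor leaked; closing it from a finaliser";
    (* There is no caller to report a close error to at this point. *)
    (try close t with _ -> ())
  end

let wrap ~close_fd fd =
  let t = { fd; close_fd; closed = false } in
  Gc.finalise finaliser t;
  t
```

The closed flag means the finaliser does nothing in the expected case where close has already been called explicitly.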


@lindig this is what the capnp-rpc library tries to do. If a resource is GC’d without being released, a warning is logged and it tries to release it anyway. It’s surprisingly complicated, and I’m still not certain that a sufficiently smart optimising compiler couldn’t release it too soon.

e.g. if you have a close function:

type t = { fd : int }

let close t =
  let fd = t.fd in
  Unix.close fd

What stops the compiler from GCing t after the let fd = line (when it doesn’t need t any longer)? I’d be interested to know if there’s a recommended way to do this.


I think a resource that is released via GC would have to be entirely abstract to avoid the problem you just demonstrated.


Using the GC on system resources is generally a bad idea. Besides the already mentioned problem of dealing with resource starvation, it also causes serious issues with error handling, e.g. what if closing the descriptor returns an error? It can be extremely hard to track down such bugs if they happen at some random time during finalization, because the error context is lost. Another problem is that system calls could conceivably block or take an inordinate amount of time before they return. Freezing the GC is not something that should ever happen.

@talex5 I once needed a trick to prevent a value from being garbage collected even in the presence of compiler optimizations (Flambda): just introduce an external dummy function that takes the block that should not be finalized and make sure it’s called after the block has been properly used, e.g.:

--- OCaml
external gc_keep_until_now : _ -> unit = "gc_keep_until_now" [@@noalloc]

--- C
CAMLprim value gc_keep_until_now(value __unused v) { return Val_unit; }

The compiler cannot know whether the external function actually uses the contents of the block so it cannot optimize the call away.


Yeah, my code currently uses Sys.opaque_identity. Hopefully that will do the trick.
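
Roughly, applied to the close example from earlier in the thread. This is a sketch, not the actual capnp-rpc code; close_fd is a stand-in parameter for Unix.close so the snippet needs only the standard library:

```ocaml
type t = { fd : int }

(* [close_fd] is a placeholder for the real system call, e.g. Unix.close. *)
let close ~close_fd t =
  let fd = t.fd in
  close_fd fd;
  (* Mention [t] again after the call. The optimiser has to treat
     Sys.opaque_identity as an unknown function, so it should not be able
     to drop this use and consider [t] dead before the descriptor is closed. *)
  ignore (Sys.opaque_identity t)
```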


Ah, thanks, that should do the trick, too, and would be more efficient, because it can be safely optimized away. I didn’t know this function had already been introduced in the standard library.