Closing file descriptors via GC finaliser?

lindig · August 2, 2017, 2:43pm

Using C, one can register functions that the GC will call before collecting an object. In theory it should be possible to create a library that uses that to close file descriptors automatically when they become unreachable. This is not a new idea but I would be curious why it’s such a bad idea that I don’t see any implementation of it.

avsm · August 2, 2017, 2:57pm

It’s generally better to explicitly control the lifetime of external resources, since you have to decide what to do under resource starvation conditions. If you run out of fds, then you are forced to trigger a (probably unnecessary) memory GC in order to get more. That in turn stalls the entire application and adds network latency.
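To illustrate what explicit lifetime control can look like, here is a minimal sketch that scopes a descriptor to a callback. The names `with_fd` and `first_byte` are made up for this example, and it assumes the `unix` library is linked and an OCaml version that provides `Fun.protect`.

```ocaml
(* Requires the unix library. [with_fd] and [first_byte] are illustrative names. *)

(* Open [path], pass the descriptor to [f], and close it on every exit path,
   whether [f] returns normally or raises. *)
let with_fd path flags perm f =
  let fd = Unix.openfile path flags perm in
  Fun.protect ~finally:(fun () -> Unix.close fd) (fun () -> f fd)

(* Usage: read the first byte of a file, if it has one. *)
let first_byte path =
  with_fd path [ Unix.O_RDONLY ] 0 (fun fd ->
      let buf = Bytes.create 1 in
      if Unix.read fd buf 0 1 = 1 then Some (Bytes.get buf 0) else None)
```

The descriptor is released at a known point in the program, so running out of fds never depends on when the GC happens to run.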

lindig · August 2, 2017, 3:03pm

Would such a library not still be useful for catching leaks and emitting warnings? The expectation would be that the file is already closed by the time it is collected; if it is not, it is closed and a warning is issued.
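A rough sketch of that leak-detector idea, using `Gc.finalise` from the standard library rather than a C finaliser. The module name `Tracked_fd` and its functions are hypothetical, the warning simply goes to stderr, and the `unix` library is assumed to be linked.

```ocaml
(* Requires the unix library. [Tracked_fd] and its functions are illustrative names. *)
module Tracked_fd : sig
  type t (* abstract, so callers cannot extract the raw descriptor *)

  val of_fd : Unix.file_descr -> t
  val close : t -> unit
  val is_closed : t -> bool
end = struct
  type t = { fd : Unix.file_descr; mutable closed : bool }

  (* Mark as closed before the system call, so that a finaliser running
     later sees the descriptor as already released. *)
  let release t =
    t.closed <- true;
    Unix.close t.fd

  (* Explicit close; calling it a second time is a no-op. *)
  let close t = if not t.closed then release t
  let is_closed t = t.closed

  (* Runs when the GC finds [t] unreachable. A leak is reported and the
     descriptor is closed as a last resort; errors are swallowed because
     there is no caller left to report them to. *)
  let on_collect t =
    if not t.closed then begin
      prerr_endline "warning: file descriptor was never closed; closing it now";
      try release t with Unix.Unix_error _ -> ()
    end

  let of_fd fd =
    let t = { fd; closed = false } in
    Gc.finalise on_collect t;
    t
end
```

In normal use the caller still calls `Tracked_fd.close`; the finaliser is only a safety net, and a real library would also wrap operations such as reading and writing behind the abstract type.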

talex5 · August 2, 2017, 3:22pm

@lindig this is what the capnp-rpc library tries to do. If a resource is GC’d without being released, a warning is logged and it tries to release it anyway. It’s surprisingly complicated, and I’m still not certain that a sufficiently smart optimising compiler couldn’t release it too soon.

e.g. if you have a close function:

type t = { fd : int }

let close t =
  let fd = t.fd in
  Unix.close fd

What stops the compiler from GCing t after the let fd = line (when it doesn’t need t any longer)? I’d be interested to know if there’s a recommended way to do this.

lindig · August 2, 2017, 3:26pm

I think a resource that is released via GC would have to be entirely abstract to avoid the problem you just demonstrated.

mmottl · August 5, 2017, 2:18am

Using the GC on system resources is generally a bad idea. Besides the already mentioned problem of dealing with resource starvation, it also causes serious issues with error handling, e.g. what if closing the descriptor returns an error? It can be extremely hard to track down such bugs if they happen at some random time during finalization, because the error context is lost. Another problem is that system calls could conceivably block or take an inordinate amount of time before they return. Freezing the GC is not something that should ever happen.
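As a small illustration of the error-handling point, an explicit close can attach context to a failure, which a finaliser running at an arbitrary time cannot do. This is only a sketch: `close_reporting` and its `~what` label are invented for the example, and the `unix` library is assumed to be linked.

```ocaml
(* Requires the unix library. [close_reporting] is an illustrative name. *)

(* Close [fd] and, on failure, return a message that says which resource
   was being closed. [what] is a caller-supplied description. *)
let close_reporting ~what fd =
  try
    Unix.close fd;
    Ok ()
  with Unix.Unix_error (err, fn, _) ->
    Error (Printf.sprintf "%s: %s failed: %s" what fn (Unix.error_message err))
```

The caller decides on the spot whether to retry, log, or abort, with the failing resource still identifiable.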

@talex5 I once needed a trick to prevent a value from being garbage collected even in the presence of compiler optimizations (Flambda): just introduce an external dummy function that takes the block that should not be finalized and make sure it’s called after the block has been properly used, e.g.:

--- OCaml
external gc_keep_until_now : _ -> unit = "gc_keep_until_now" [@@noalloc]

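--- C
#include <caml/mlvalues.h>

/* Deliberately empty: the call only has to exist so that v stays live until here. */
CAMLprim value gc_keep_until_now(value v) { (void) v; return Val_unit; }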

The compiler cannot know whether the external function actually uses the contents of the block so it cannot optimize the call away.

talex5 · August 5, 2017, 10:01am

Yeah, my code currently uses Sys.opaque_identity. Hopefully that will do the trick.
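For readers who have not seen it, this is roughly how `Sys.opaque_identity` could be applied to the earlier `close` example. It is a sketch, not the actual capnp-rpc code: the record here holds a `Unix.file_descr` instead of an `int`, the `unix` library is assumed to be linked, and whether this is guaranteed to keep `t` alive is exactly the open question in this thread.

```ocaml
(* Requires the unix library. *)
type t = { fd : Unix.file_descr }

let close t =
  let fd = t.fd in
  Unix.close fd;
  (* Mention [t] again after the system call. The optimiser has to treat
     [Sys.opaque_identity] as an unknown function, so the intent is that
     [t] is still referenced at this point. *)
  ignore (Sys.opaque_identity t)
```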

mmottl · August 5, 2017, 2:45pm

Ah, thanks, that should do the trick, too, and would be more efficient, because it can be safely optimized away. I didn’t know this function was already introduced in the standard library.
