Wrapping C++ std::shared_ptr and similar smart pointers

gadmm · April 17, 2019, 5:56am

(Sorry for the delay as I have been busy.)

It all comes down to the fact that tracing and reference counting (RC) have different advantages and drawbacks. The main difference for this question is that RC reclaims promptly, whereas tracing does not reclaim predictably. In addition, OCaml is currently weak at predictable resource management.

Smart pointers can be used to manage resources other than memory. (I mean smart pointers that implement deterministic reclamation of resources such as unique or reference-counted pointers; in principle smart pointers are not restricted in what they implement: delayed evaluation, roots for tracing GCs… such exotic pointers are out of the scope of my answer.)

First, you need to determine whether the pointer manages non-memory resources (the destruction closes a file, releases a lock, rolls back some state…). If so, using finalizers is a no-go, because you cannot predict when and in which order finalizers run, and in practice it can be way too late. When that is the case, skip 1). For instance I see that your library has some functions that return RAII guards; quite obviously these cannot be handled with finalizers.

1) Custom blocks with finalizer

If the smart pointer only manages memory, then it is possible to represent it with a custom block with a finalizer attached to it. The GC needs to know the size of what the block manages, otherwise it will not work hard enough to reclaim memory and you can end up with a memory leak. This has occasionally been called the familiar “allocation of custom objects mess up the speed of the major GC” problem.

The situation is supposed to improve in OCaml 4.08, which introduces a new function caml_alloc_custom_mem that lets you specify the size of the memory managed by the custom block, which the GC’s heuristics will take into account. (caml_alloc_custom also has parameters to tweak the GC speed but presumably this was not good enough as witnessed by the multiple bug reports referenced in that PR.)

So you can use as a source of inspiration @ygrek’s wrapped pointer you have linked to above, but you must adapt it to tell the OCaml GC the size of the data your custom block contains.
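The idea of tying a reference count to the lifetime of an OCaml value can be simulated in pure OCaml. The sketch below is only an illustration under loud assumptions: `foreign`, `handle`, `wrap` and `payload` are hypothetical names, the "foreign" object is an ordinary OCaml record standing in for data owned by a std::shared_ptr, and `Gc.finalise` stands in for the finalizer of a custom block. A real binding would allocate the custom block from C stubs and, unlike this sketch, would also tell the GC how much memory the block manages.

```ocaml
(* Stand-in for an object living on the C++ side behind a shared_ptr.
   Hypothetical: a real binding would not see the count directly. *)
type foreign = { mutable refcount : int; payload : string }

(* OCaml-side handle. It is a heap-allocated record, which Gc.finalise
   requires. *)
type handle = { target : foreign }

(* Take one reference on behalf of the OCaml value, and give it back
   whenever the GC decides the handle is unreachable. *)
let wrap (f : foreign) : handle =
  f.refcount <- f.refcount + 1;
  let h = { target = f } in
  Gc.finalise (fun h -> h.target.refcount <- h.target.refcount - 1) h;
  h

let payload (h : handle) : string = h.target.payload
```

While the handle is reachable, the count stays incremented. Nothing in the program decides when the decrement happens: that is left to the GC, which is why this approach is restricted to memory.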

Pros:

Expressive: the foreign data is abstracted as an OCaml value that can be passed around, inserted into data structures, etc.

Cons:

No-go for non-memory resources.
You need to know the size of what you are managing—there is no universal smart pointer wrapper!
Not so good for performance/scale or interoperability. Mixing tracing and RC combines the drawbacks of both; in particular you inherit the possibly unbounded latency due to the upfront deallocation cost of RC (depending on your use-case), and you are even at risk of creating cycles that are never collected if you mix this method with one that stores OCaml values on the foreign side.

These are some guaranteed theoretical drawbacks, but I imagine that there can be more practical implementation-specific issues (as witnessed by caml_alloc_custom vs caml_alloc_custom_mem). I do not have hands-on experience with custom blocks, and while researching for this answer, I found this usage not very well documented, so I hope that experts can fill in the gaps and/or correct the above if needed.

2) Deterministic resource management

To avoid the impedance mismatch between smart pointers and the GC, you can rely on deterministic resource management. In OCaml, the idiomatic expression of it is to use “with_” wrappers based on unwind-protect [see the example of files]. OCaml 4.08 introduces Fun.protect, an implementation of unwind-protect suitable for OCaml.
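As a minimal sketch of such a “with_” wrapper, assuming OCaml 4.08 or later for Fun.protect: the names `resource`, `acquire`, `release`, `with_resource` and the `events` log are hypothetical and only stand in for a real foreign resource.

```ocaml
(* Hypothetical resource; the event log lets us observe the order of
   acquisition and release. *)
type resource = { name : string; mutable released : bool }

let events : string list ref = ref []
let log s = events := s :: !events

let acquire name =
  log ("acquire " ^ name);
  { name; released = false }

let release r =
  r.released <- true;
  log ("release " ^ r.name)

(* Run [f] on a fresh resource; release it when [f] returns or raises. *)
let with_resource name f =
  let r = acquire name in
  Fun.protect ~finally:(fun () -> release r) (fun () -> f r)

(* Nested scopes are released in the reverse order of acquisition. *)
let () =
  with_resource "a" (fun _ ->
    with_resource "b" (fun _ -> ()))
(* List.rev !events =
   ["acquire a"; "acquire b"; "release b"; "release a"] *)
```

The release also runs when the callback raises, so callers do not have to write their own exception handlers around each use.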

Pros:

Predictable: can be used for non-memory resources.

Cons:

Lacks expressiveness: resources live for the exact duration of their defining scope, and are reclaimed in LIFO order.
Allows “use after free”: if one is not careful, the resource can still be referenced outside of its scope.
Currently incompatible with asynchronous exceptions: OCaml does not currently allow an implementation of unwind-protect that protects from asynchronous exceptions being raised inside the finally clause.
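The “use after free” drawback can be sketched as follows. Everything here is hypothetical (`conn`, `with_conn`, the `open_` flag); the point is only that the type system does not stop the handle from leaving its scope.

```ocaml
(* Hypothetical connection handle with a flag recording its state. *)
type conn = { id : int; mutable open_ : bool }

(* The connection is closed as soon as the callback is done. *)
let with_conn id f =
  let c = { id; open_ = true } in
  Fun.protect ~finally:(fun () -> c.open_ <- false) (fun () -> f c)

(* The callback returns the handle itself, so it outlives its scope:
   [escaped] refers to a connection that is already closed. *)
let escaped : conn = with_conn 1 (fun c -> c)

(* The same happens when a closure captures the handle. *)
let is_open_later : unit -> bool = with_conn 2 (fun c -> fun () -> c.open_)
```

Both definitions type-check, and both leave the program holding a stale handle; any check has to happen at run time.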

3) Manual resource management

If neither 1) nor 2) fits the bill, you have to resort to manual resource management, in which the user has to call some free function explicitly (and gets an exception if they use the resource after it has been freed). It is “hard” to program correctly with manual resource management, more so in the presence of exceptions. For this reason, people mix it with 1) and/or 2); for instance they use unwind-protect in a non-systematic manner, or they attach finalizers to act as a fallback, or both. While with 1) and 2) you are still within the realm of structured programming, with manual resource management you enter the realm of debugging-oriented programming: think programming in a weird dialect of old C++.
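A minimal sketch of this style, mixing in both fallbacks mentioned above. All names (`handle`, `create`, `free`, `get`, `Use_after_free`, `length_then_free`) are hypothetical, and the string field stands in for a foreign resource.

```ocaml
(* Hypothetical manually managed handle. *)
type handle = { mutable freed : bool; data : string }

exception Use_after_free

(* Explicit release; calling it twice is harmless. *)
let free h = h.freed <- true

let create data =
  let h = { freed = false; data } in
  (* Fallback only: if the user forgets to call [free], the GC will
     eventually do it, at an unpredictable time. *)
  Gc.finalise free h;
  h

(* Every access checks the flag and raises after [free]. *)
let get h = if h.freed then raise Use_after_free else h.data

(* Non-systematic use of unwind-protect around one particular use. *)
let length_then_free h =
  Fun.protect ~finally:(fun () -> free h) (fun () -> String.length (get h))
```

Correctness now depends on every caller remembering to call `free` exactly where the resource stops being needed, and on nobody calling `get` afterwards.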

Pros:

Last resort solution

Cons:

Non-idiomatic code
Hard to program
Hard to reason about the code

Discussions with Serious Industrial OCaml Users a while ago (starting around POPL 2017 in Paris) have brought to light OCaml’s current issues with resource management. These discussions prompted a proposal for a resource management model for OCaml, inspired by RAII and move semantics from modern C++/Rust. In a nutshell, it aims to lift the expressiveness limitations of 2). Interoperability is probably its most important application.
