Wrapping C++ std::shared_ptr and similar smart pointers

Hello everyone!

I’m trying to create Reason/OCaml bindings for the Skia 2D graphics library. The library makes heavy use of smart pointers similar to std::shared_ptr, which Skia calls sk_sp.

My first idea for wrapping these was to use a regular pointer that points to a shared pointer. However, the memory behind the shared pointer would then be released as soon as the local variable holding the shared pointer goes out of scope.

I found a solution that looked promising to me in https://github.com/ygrek/scraps/blob/master/cxx_wrapped.h but I have since heard that decrementing the refcount in the finalizer is also not a good idea. Unfortunately, I don’t know why that is, and I don’t have a better idea either.

Can anyone help me understand this better and point me towards a better approach?

(Sorry for the delay as I have been busy.)

It all comes down to the fact that tracing and reference counting (RC) have different advantages and drawbacks. The main difference for this question is that RC reclaims resources promptly, whereas tracing does not reclaim them at a predictable time. In addition, OCaml currently offers little support for predictable resource management.

Smart pointers can be used to manage resources other than memory. (I mean smart pointers that implement deterministic reclamation of resources such as unique or reference-counted pointers; in principle smart pointers are not restricted in what they implement: delayed evaluation, roots for tracing GCs… such exotic pointers are out of the scope of my answer.)

First, you need to determine whether the pointer manages non-memory resources (the destruction closes a file, releases a lock, rolls back some state…). If so, using finalizers is a no-go, because you cannot predict when and in which order finalizers run, and in practice it can be way too late. When that is the case, skip 1). For instance I see that your library has some functions that return RAII guards; quite obviously these cannot be handled with finalizers.

1) Custom blocks with finalizer

If the smart pointer only manages memory, then it is possible to represent it with a custom block with a finalizer attached to it. The GC needs to know the size of what it manages, otherwise it will not work hard enough to reclaim memory and you can end up with a memory leak. This has occasionally been called the familiar “allocation of custom objects mess up the speed of the major GC” problem.

The situation is supposed to improve in OCaml 4.08, which introduces a new function caml_alloc_custom_mem that lets you specify the size of the memory managed by the custom block, which the GC’s heuristics will take into account. (caml_alloc_custom also has parameters to tweak the GC speed but presumably this was not good enough as witnessed by the multiple bug reports referenced in that PR.)

So you can use @ygrek’s wrapped pointer, which you linked to above, as a source of inspiration, but you must adapt it to tell the OCaml GC the size of the data your custom block contains.
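To make the idea of "one reference released by the finalizer" concrete, here is a pure-OCaml simulation. It is only a sketch under stated assumptions: the type foreign and the functions make_foreign, ref_, unref and wrap are hypothetical stand-ins for the foreign object and its reference count, and Gc.finalise plays the role that the finalize function of a custom block would play in a real C stub. The size hint given to caml_alloc_custom_mem is passed on the C side and is not shown here.

```ocaml
(* Simulated foreign object with an intrusive reference count.
   All names here are hypothetical; a real binding would keep this in C++. *)
type foreign = { mutable refcount : int; mutable destroyed : bool }

let make_foreign () = { refcount = 0; destroyed = false }

let ref_ (o : foreign) = o.refcount <- o.refcount + 1

let unref (o : foreign) =
  o.refcount <- o.refcount - 1;
  (* The last reference destroys the object, as a shared pointer would. *)
  if o.refcount = 0 then o.destroyed <- true

(* The OCaml-side handle owns exactly one reference to the foreign object. *)
type handle = { target : foreign }

let wrap (o : foreign) : handle =
  ref_ o;
  let h = { target = o } in
  (* Release our reference when the GC finds [h] unreachable.
     When (and whether) this runs before program exit is up to the GC. *)
  Gc.finalise (fun h -> unref h.target) h;
  h

let () =
  let o = make_foreign () in
  let h = wrap o in
  assert (o.refcount = 1 && not o.destroyed);
  ignore (Sys.opaque_identity h);
  (* [h] is no longer used from here on. A full major collection will often
     run the finalizer, but nothing guarantees it, so we only print. *)
  Gc.full_major ();
  Printf.printf "destroyed after Gc.full_major: %b\n" o.destroyed
```

The point of the sketch is the last line: the moment at which the foreign object is destroyed is decided by the GC, which is acceptable for memory but not for files or locks.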

Pros:

  • Expressive: the foreign data is abstracted as an OCaml value that can be passed around, inserted into data structures, etc.

Cons:

  • No-go for non-memory resources.

  • You need to know the size of what you are managing—there is no universal smart pointer wrapper!

  • Not so good for performance/scale or interoperability. Mixing tracing and RC combines the drawbacks of both; in particular you inherit the possible unbounded latency due to the upfront deallocation cost of RC (depending on your use-case), and you are even at risk of creating cycles that are never collected if you mix this method with one that stores OCaml values on the foreign side.

These are some guaranteed theoretical drawbacks, but I imagine that there can be more practical implementation-specific issues (as witnessed by caml_alloc_custom vs caml_alloc_custom_mem). I do not have hands-on experience with custom blocks, and while researching for this answer, I found this usage not very well documented, so I hope that experts can fill in the gaps and/or correct the above if needed.

2) Deterministic resource management

To avoid the impedance mismatch between smart pointers and the GC, you can rely on deterministic resource management. In OCaml, the idiomatic expression of it is to use “with_” wrappers based on unwind-protect [see the example of files]. OCaml 4.08 introduces Fun.protect, an implementation of unwind-protect suitable for OCaml.
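A minimal sketch of such a “with_” wrapper, assuming OCaml 4.08 or later for Fun.protect. The resource type and the functions acquire, release and with_resource are hypothetical placeholders for whatever the binding exposes; the events list only exists to make the order of operations visible.

```ocaml
(* Hypothetical resource standing in for a foreign handle. *)
type resource = { name : string; mutable released : bool }

(* Trace of what happened, most recent event first. *)
let events : string list ref = ref []
let record event = events := event :: !events

let acquire name =
  record ("acquire " ^ name);
  { name; released = false }

let release r =
  r.released <- true;
  record ("release " ^ r.name)

(* The resource lives exactly for the duration of the call to [f]:
   [release] runs whether [f] returns normally or raises. *)
let with_resource name f =
  let r = acquire name in
  Fun.protect ~finally:(fun () -> release r) (fun () -> f r)

let () =
  (* Normal return: the result of the body is passed through. *)
  let len = with_resource "a" (fun r -> String.length r.name) in
  assert (len = 1);
  (* Exception in the body: the resource is released, then the
     exception propagates to the caller. *)
  (match with_resource "b" (fun _ -> failwith "boom") with
   | () -> assert false
   | exception Failure _ -> ());
  (* Nested scopes are released in LIFO order: inner first, then outer. *)
  with_resource "outer" (fun _ -> with_resource "inner" (fun _ -> ()));
  List.iter print_endline (List.rev !events)
```

Note that Fun.protect expects the finally function itself not to raise; if it does, the exception is wrapped in Fun.Finally_raised.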

Pros:

  • Predictable: can be used for non-memory resources.

Cons:

  • Lacks expressiveness: resources live for the exact duration of their defining scope, and are reclaimed in LIFO order.

  • Allows “use after free”: the resource can be referenced outside of its scope if you are not careful.

  • Currently incompatible with asynchronous exceptions: OCaml does not currently allow an implementation of unwind-protect that protects from asynchronous exceptions being raised inside the finally clause.
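The “use after free” drawback can be illustrated with a short sketch. The conn type and the functions with_conn and send are hypothetical; the is_open flag is one possible way, assumed here, to turn a silent use after free into an exception.

```ocaml
(* Hypothetical scoped resource with a validity flag. *)
type conn = { id : int; mutable is_open : bool }

let with_conn id f =
  let c = { id; is_open = true } in
  (* Mark the connection as closed when the scope ends. *)
  Fun.protect ~finally:(fun () -> c.is_open <- false) (fun () -> f c)

let send c msg =
  if not c.is_open then invalid_arg "send: connection used outside its scope";
  Printf.sprintf "%d:%s" c.id msg

let () =
  (* Inside the scope the resource is usable. *)
  assert (with_conn 1 (fun c -> send c "hi") = "1:hi");
  (* Nothing in the types prevents the handle from escaping its scope... *)
  let escaped = with_conn 2 (fun c -> c) in
  (* ...so the check has to happen at run time. *)
  match send escaped "hi" with
  | _ -> assert false
  | exception Invalid_argument _ -> print_endline "use after free detected"
```

Without the flag, the call on the escaped handle would reach the foreign side with a dangling resource.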

3) Manual resource management

If neither 1) nor 2) fits the bill, you have to resort to manual resource management, in which the user has to call some free function explicitly (and gets an exception if they use the resource after it has been freed). It is “hard” to program correctly with manual resource management, more so in the presence of exceptions. For this reason, people mix it with 1) and/or 2); for instance they use unwind-protect in a non-systematic manner, or they attach finalizers to act as a fallback, or both. While with 1) and 2) you are still within the realm of structured programming, with manual resource management you enter the realm of debugging-oriented programming: think programming in a weird dialect of old C++.
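A minimal sketch of this mixed style, under stated assumptions: the handle type, the Use_after_free exception, the live counter and the functions create, get and free are all hypothetical, and the integer field raw stands in for a foreign pointer. The explicit free is the primary mechanism; the finalizer is only a fallback for handles the user forgot to free.

```ocaml
exception Use_after_free

(* Hypothetical handle; [raw] stands in for a foreign pointer. *)
type handle = { raw : int; mutable freed : bool }

(* Number of simulated foreign objects currently alive. *)
let live = ref 0

(* Explicit release. Idempotent, so that the finalizer fallback and a
   second manual call are both harmless. *)
let free h =
  if not h.freed then begin
    h.freed <- true;
    decr live
  end

let create raw =
  let h = { raw; freed = false } in
  incr live;
  (* Fallback only: if the user never calls [free], the GC will
     eventually do it, at an unpredictable time. *)
  Gc.finalise free h;
  h

(* Every access checks the flag and raises instead of touching freed data. *)
let get h = if h.freed then raise Use_after_free else h.raw

let () =
  let h = create 42 in
  assert (get h = 42);
  free h;
  (match get h with
   | _ -> assert false
   | exception Use_after_free -> print_endline "use after free detected");
  (* A second call to [free] is a no-op. *)
  free h
```

Making free idempotent is a design choice of this sketch; raising on a double free would be an equally reasonable convention.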

Pros:

  • Last resort solution

Cons:

  • Non-idiomatic code

  • Hard to program

  • Hard to reason about the code

Discussions with Serious Industrial OCaml Users a while ago (starting around POPL 2017 in Paris) brought to light OCaml’s current issues with resource management. These discussions prompted a proposal for a resource management model for OCaml, inspired by RAII and move semantics from modern C++/Rust. In a nutshell, it aims to lift the expressiveness limitations of 2). Interoperability is probably its most important application.

Thank you so much for the comprehensive response, @gadmm!

I think I have a clear picture of what to do now.

I will extend the wrapper structures written by @ygrek to support proper usage of the heuristics using caml_alloc_custom_mem. For one case where a smart pointer actually wraps a file handle, I will need to figure out a way to apply the second approach. That should make sure that the file isn’t unnecessarily blocked beyond its usage.

On a completely different note: I found a purely pointer-based experimental C API in Skia as well. In the long term, I will probably investigate whether it makes sense to use that in combination with regular custom objects instead. If it supports all required functionality, that is.
That should yield the best performance, shouldn’t it?

For naked pointers that own their resource (i.e. that give you the responsibility to dispose of it), nothing in my reply changes. It is not specific to smart pointers, so the same caveats apply. Besides, I doubt that performance is a factor in your case. You might find other advantages to relying on a more stable C API, though.