I’d like to understand ephemerons better. They have been added to the OCaml standard library and are designed for implementing caches whose entries can be collected by the GC once they are no longer used.
So far, I would use the Weak module for this as follows: when a program deals with a lot of identical strings, it makes sense to store only one unique copy, as it saves space and speeds up comparisons. Below is a module Atom that stores strings in a weak hash table and ensures that each string exists only once. A string that is not referenced from outside the hash table can still be GC’ed.
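module Atom : sig
  val string : string -> string
end = struct
  (* String already provides equal; add the hash that Weak.Make needs. *)
  module S = struct
    include String
    let hash = Hashtbl.hash
  end

  (* Weak hash set: elements are dropped once nothing else refers to them. *)
  module H = Weak.Make (S)

  let atoms = H.create 29

  (* Return the shared copy of str, registering str itself if it is new. *)
  let string str =
    try H.find atoms str
    with Not_found ->
      H.add atoms str;
      str
end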
Would there be an advantage to use Ephemerons or when and how would I use them instead?
Ephemerons are useful if you want to do things like hash consing. See here, there’s an example in the test folder. If you want to achieve something similar without Ephemerons, it would look like this. The reason is that for this you need a weak hash table (that is, an Ephemeron with one key) rather than a weak hash set.
To understand the example of hash consing, see this paper.
EDIT: the Wikipedia page on ephemerons also explains some differences and use cases.
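To make the "weak hash table" idea a bit more concrete, here is a minimal sketch (not the linked example) of a memo table built with Ephemeron.K1.Make from the standard library. The node type, describe, and cached_describe are hypothetical names made up for illustration. The point is that each entry maps a key to a value, and the value is kept alive only as long as its key is reachable from elsewhere; a reference from the value back to the key does not keep the entry alive.

```ocaml
(* Hypothetical key type; any heap-allocated value would do. *)
type node = { id : int; label : string }

module NodeKey = struct
  type t = node
  let equal a b = a.id = b.id
  let hash n = Hashtbl.hash n.id
end

(* Hash table with weak keys: an entry may disappear once its key
   is no longer referenced from outside the table. *)
module Cache = Ephemeron.K1.Make (NodeKey)

let cache : string Cache.t = Cache.create 16

(* Stand-in for an expensive computation we want to memoize. *)
let describe n =
  Printf.sprintf "#%d:%s" n.id (String.uppercase_ascii n.label)

(* Look the node up first; compute and store the result on a miss. *)
let cached_describe n =
  match Cache.find_opt cache n with
  | Some d -> d
  | None ->
    let d = describe n in
    Cache.add cache n d;
    d
```

With Weak.Make alone you only get a set of keys, so there is no place to hang the cached value; the one-key ephemeron table gives you that place without turning the cache into a memory leak.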
What’s happening here is that I’m using an ‘ephemeral cache’ (i.e. a cache backed by an ephemeron hash table, here) to store subscribers to a ‘topic’, i.e. a pub-sub bus. You get a subscription token when you subscribe to a topic, and part of that token is the cache key. The cache is ‘ephemeral’ so as soon as the subscription token goes out of scope, it and its corresponding subscription (concretely, the stream and its push function) are automatically deleted from the cache.
Hence, there’s no ‘unsubscribe’ or ‘close topic’ functionality: it’s assumed that you want to unsubscribe if you let the subscription token go out of scope.
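A rough, stdlib-only sketch of that idea follows. It is not the code linked above: the Topic module, its function names, and the use of a plain Queue as an inbox (instead of a stream and push function) are all assumptions made for illustration. Because the ephemeron hash tables in recent OCaml versions do not, as far as I know, expose an iter function, this sketch additionally keeps a weak set of tokens (Weak.Make) to iterate over when publishing.

```ocaml
module Topic : sig
  type 'a t
  type token
  val create : unit -> 'a t
  val subscribe : 'a t -> token
  val publish : 'a t -> 'a -> unit
  val pull : 'a t -> token -> 'a option
end = struct
  (* A boxed record, so the GC can track whether the token is still alive. *)
  type token = { id : int }

  module Key = struct
    type t = token
    let equal = ( == )      (* tokens are compared by identity *)
    let hash tok = tok.id
  end

  module Tokens = Weak.Make (Key)           (* weak set of live tokens *)
  module Inboxes = Ephemeron.K1.Make (Key)  (* token -> inbox, weak in the token *)

  type 'a t = {
    tokens : Tokens.t;
    inboxes : 'a Queue.t Inboxes.t;
    mutable next : int;
  }

  let create () =
    { tokens = Tokens.create 16; inboxes = Inboxes.create 16; next = 0 }

  (* The caller holds the only strong reference to the returned token. *)
  let subscribe topic =
    let tok = { id = topic.next } in
    topic.next <- topic.next + 1;
    Tokens.add topic.tokens tok;
    Inboxes.add topic.inboxes tok (Queue.create ());
    tok

  (* Deliver msg to the inbox of every token that is still alive. *)
  let publish topic msg =
    Tokens.iter
      (fun tok ->
        match Inboxes.find_opt topic.inboxes tok with
        | Some inbox -> Queue.push msg inbox
        | None -> ())
      topic.tokens

  (* Take the oldest pending message for this subscription, if any. *)
  let pull topic tok =
    match Inboxes.find_opt topic.inboxes tok with
    | Some inbox -> Queue.take_opt inbox
    | None -> None
end
```

Once the caller drops its token, nothing strong refers to it any more, so the GC is free to remove both the token from the weak set and its inbox from the ephemeron table. When exactly that happens is up to the GC, so the sketch makes no promise about timing.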