ANN: Releases of ringo

On behalf of Nomadic Labs, I am pleased to announce the first few releases of Ringo: a library for caches. Ringo offers two kinds of caches: Maps, for caches of key-value pairs, and Sets, for caches of simple elements. In addition, each kind of cache can be tweaked to handle its bounds differently.

Ringo versions 0.1, 0.2 and 0.3 are available on opam. As the version numbers and the bundled announcements suggest, this library is still in the early phases of release: additional replacement policies will be added, the interface will probably change somewhat, etc. Suggestions are welcome!

Even though the interface may still change, the implementation is covered by an extensive test suite and is already in use in the Tezos project.

The code is available at


Version 0.4 of ringo is now available in opam. This version includes bug fixes, minor (sometimes breaking) interface and semantics improvements and, most importantly, a ringo-lwt package.

ringo-lwt provides wrappers for using caches in an Lwt-heavy application. Specifically, it provides a functor that transforms a Ringo cache into a Ringo-lwt cache featuring:

  • val find_or_replace : 'a t -> key -> (key -> 'a Lwt.t) -> 'a Lwt.t which helps avoid race conditions,
  • automatic cleanup, by which rejected promises are removed from the cache.

Additional functors for option (with automatic cleanup of None) and result (with automatic cleanup of Error) are also provided.
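To give an idea of the semantics, here is a rough pure-OCaml analogue of find_or_replace over a plain Hashtbl. This is only a sketch (the function name mirrors ringo-lwt's but the code is mine, and it ignores Lwt entirely):

```ocaml
(* Sketch of find_or_replace semantics over a plain Hashtbl.
   The real ringo-lwt version works on promises: the promise itself is
   stored before it resolves, so concurrent lookups of the same key
   share one computation instead of racing to compute it twice. *)
let find_or_replace tbl key compute =
  match Hashtbl.find_opt tbl key with
  | Some v -> v                  (* hit: reuse the stored value *)
  | None ->
    let v = compute key in       (* miss: compute once... *)
    Hashtbl.replace tbl key v;   (* ...store the result... *)
    v                            (* ...and return it *)
```

In the Lwt setting this is what "helps avoid race conditions": a second caller arriving while the first computation is still pending waits on the same promise rather than launching a duplicate one.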


Version 0.5 of ringo and ringo-lwt is now available in opam. Although this version changes ringo-lwt only, both packages are released anew to keep the version numbers in sync. This version includes:

  • Improvements to the documentation.
  • Simplifications and a reduction in the memory footprint of Lwt-wrapped caches.
  • A fix for a race condition in the automatic cleanup (previously, on weak caches only, a promise being rejected could cause a different promise to be removed from the cache).
  • A fix for a leak.
  • More tests, including a test for leakiness.

What is ringo?
“Caches (bounded-size key-value stores) and other bounded-size stores”, says opam.
When advertising software, you should always remind readers what the software is about.
99.9999% of people know nothing about what you are talking about or advertising…

And by the way, what are the use cases for ringo?

Are there any alternatives to it?



Ringo provides bounded-size key-value stores. More specifically, it provides a functor similar to Hashtbl.Make except that the number of bindings held by the tables is limited: inserting additional bindings when the limit has been reached causes some previously inserted binding to be removed.

Even more specifically, Ringo provides a function map_maker that takes parameters to customise the policies determining the behaviour of the cache when supernumerary bindings are inserted, and returns the functor described above. Once a module Cache is instantiated using this functor, it can be used as follows:

let cache = Cache.create size
let fetch_data uri =
  match Cache.find_opt cache uri with
  | Some data -> data
  | None ->
    let data = really_fetch_data uri in
    Cache.replace cache uri data;
    data

The cache will only hold up to [size] bindings, which avoids leaking memory. Additionally, the parameters for map_maker allow you to customise:

  • The replacement policy: which binding is removed when a supernumerary binding is inserted (currently, least-recently used and first-in first-out are supported).
  • The overflow policy: whether the cache can weakly hold some supernumerary elements (if so, the cache may hold more but the GC can always collect them if space is lacking).
  • The accounting precision: whether to keep precise track of removed/replaced elements.
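For intuition about what a replacement policy does, a minimal first-in first-out bounded map can be sketched in a few lines over the standard library. This illustrates the policy only; it is not Ringo's implementation, and the module and function names are invented:

```ocaml
(* A toy FIFO-bounded cache: inserting a binding beyond the capacity
   evicts the oldest binding. Hypothetical sketch, not Ringo's API. *)
module Fifo_cache = struct
  type ('k, 'v) t = {
    capacity : int;
    order : 'k Queue.t;           (* insertion order, oldest first *)
    table : ('k, 'v) Hashtbl.t;   (* the actual bindings *)
  }

  let create capacity =
    { capacity; order = Queue.create (); table = Hashtbl.create capacity }

  let find_opt c k = Hashtbl.find_opt c.table k

  let replace c k v =
    if not (Hashtbl.mem c.table k) then begin
      if Queue.length c.order >= c.capacity then begin
        (* evict the oldest binding to stay within the bound *)
        let oldest = Queue.pop c.order in
        Hashtbl.remove c.table oldest
      end;
      Queue.push k c.order
    end;
    Hashtbl.replace c.table k v
end
```

With a capacity of 2, inserting a third binding silently drops the first one; an LRU policy would instead drop whichever binding was accessed least recently.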

In addition, Ringo also provides set-caches: i.e., sets (rather than maps) with bounded size and all the same properties as above.

Also note that ringo-lwt provides Lwt wrappers around Ringo caches.

If you have suggestions for a different concise synopsis for opam, feel free to send them this way.

Use cases are, I guess, caches. In particular, caches that might receive many elements, not all of which you can hold in memory. We use it in a few places in the Tezos project to hold resources (blocks, operations, etc.) fetched from the P2P layer: it avoids having to fetch them again from the network.

I think anycache, lru, and lru-cache are all alternatives available on opam.


The documentation is now available online at

Of particular interest:


Hi, I don’t have a GitLab account, so I hope it’s OK to ask here: would it be possible to add an ‘eviction hook’ to ringo’s cache? E.g. LRU cache is full, we need to evict an item to make room for a new one. Before evicting it, we need to do some cleanup e.g. maybe close a file. It would be useful to have a hook provided at time of creating the cache, which could be applied to the entry before it is evicted.


I have discussed this feature with some users of the library. The main issue concerns weak caches (caches with weak pointers). In such a cache there are two distinct notions of being evicted:

  • When the cache has reached the set size limit, one of the items is moved from the strongly-held set of items into the weakly-held set of items. This is an eviction in the sense that it is the process by which the cache's size stays bounded.

  • When the GC kicks in at some point, the weakly-held items may be collected. This is an eviction in the sense that prior to that point you could retrieve the item and after that point you cannot. I think of this one as the true eviction: it is the moment when the item actually becomes unavailable.
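The second kind of eviction can be observed directly with the standard library's Weak arrays, which are a simplified stand-in for the ephemeron-based tables a weak cache uses: once no strong reference to a value remains, a full major collection clears the weak slot.

```ocaml
(* A weakly-held value disappears once the GC collects it. *)
let w : int ref Weak.t = Weak.create 1

let () =
  let v = ref 42 in
  Weak.set w 0 (Some v);
  (* v is still strongly reachable here, so the slot is full *)
  assert (Weak.get w 0 <> None)
  (* v goes out of scope: no strong reference remains *)

let () =
  Gc.full_major ();               (* force a major collection *)
  assert (Weak.get w 0 = None)    (* the weak pointer has been cleared *)
```

An eviction callback tied to this second notion would thus fire at GC time, not at insertion time, which is exactly what makes the design questions below tricky.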

From there we had open design and implementation questions which we did not resolve. Maybe you have some interesting input.

What’s your use case for the eviction callback?
Should we have eviction callbacks for Strong caches only?
Should we have two distinct eviction callbacks for the two distinct notions of eviction in the weak caches? Should we have only one and which one?
Is it ok that, in a weak cache, you may end up calling many of the eviction callbacks essentially at the same time (at collection time)?

I’m also unsure about some implementation details regarding how to attach GC finalisers to elements in an Ephemeron table. I need a refresher on that! Does anyone have an explanatory, example-driven, tutorial-like bit of documentation to recommend?

And something I’m concerned about is potentially adding significant use of GC finalisers in the context of a multicore program. What would this do? Any recommendations for how to handle it?


Hi, thanks for explaining the issue, it’s more complex than I had thought at first. Actually my use case is fairly simple, but slightly tricky because of resource management. In the end I decided to homebrew a solution and got rid of a dependency.

For the curious, my use-case is managing a cache of open SQLite databases, mapped from their names to the Sqlite3.db values. I need to be able to treat this as an LRU cache and remove entries from the cache at any time and safely close the DBs.

You may want to have a look at how it’s done in this library[1]; it does exactly that. I don’t think you really need what you’re asking for: just define a larger db handle structure and do your cleanup business when you remove stuff from the cache and/or close your db handle.

  1. I’m not mentioning the name and it’s not mirrored on GitHub yet because the name has to change. Between when I started and now, someone took the name on opam for something else and I’ve been procrastinating on making the painful change. ↩︎


I’m guessing you don’t use a weak cache because you are using the cache limit to limit non-memory resource usage. And so I’m guessing that a solution focused on strong caches would suffice.

I’ll keep that in mind if I revisit the issue. And I’ll also update the doc to specifically mention memory vs other resources and discourage the use of ringo’s weak caches for managing other resources.


Interesting, thank you. Just to clarify something, I need to be able to remove items in random positions in the cache, not just the LRU item. So, instead of a custom-defined structure to track the LRU, I’m using an in-memory SQLite table. When I insert rows (database name) into it, SQLite automatically assigns them rowids, in ascending order. So the lowest rowid in the table at any point is the LRU item. And removing an item from a random position is simple because the table rows are indexed by the database name column. Basically, I’m letting SQLite do the bookkeeping for me.