Compact, memory-contiguous data structures

Sometimes, I wonder if many functional data structures used in OCaml
could not have a [compact] method (lists, sets, maps, etc).

This method would encode the data-structure in a contiguous memory region,
so that the data-structure has more chance to fit into the CPU cache.

Of course, once you have called the [compact] method, you are not supposed
to modify the data-structure anymore (or, if you do so then the data structure
would become “uncompact” again, as it was before).

Some usage example:

  • you construct a set once
  • then many times you want to lookup elements for membership in that set
2 Likes

maybe @backtracking might have ideas on this

The [compact] method doesn’t need to be a function of the data structure, but rather, it can be a function of the whole program.
We can achieve that by incorporating locality information in the [Gc.compact] algorithm. I believe I saw some papers about GCs that can do something like this, but I can’t find them right now.
The advantage here is that we don’t need to modify the data structures at all.

However, it’s not all perfect:

  1. We still have to keep around the header words – so we don’t achieve maximum compaction
  2. If we have something like an n-ary tree, there are several possible layout choices, and we would have to have a heuristic to pick one, rather than a data-structure specific way.

But! I wonder how much it actually matters for the data structure to be physically continuous in memory. Rather, we may want to look at the utilization of cache lines. If we have lots of “gaps”, i.e. individual blocks whose length is not a multiple of the cache line size, then we’re wasting cache memory. So if we design the data structure to be composed of blocks that are a multiple of the cache line size, then we will achieve perfect utilization of the cache (provided we’re aligned to cache lines, probably including the header word).
In that case, I think we should have equivalent or better performance than contiguous layouts.

1 Like

I don’t know their performance characteristics, but can recall hearing about succint tree (set/map) representastions from this link which is kind of relevant because compact means taking less space. https://tupl.cs.tufts.edu/papers/1904.02809.pdf

There’s also an interesting article on making list traversal 91% faster, not by changing the memory layout (although memory layout is important for it), but through a “value speculation” trick which transforms the list iteration to be like an array iteration. https://www.lortex.org/articles/value-speculation-ocaml/

1 Like

If what we need is an iterator, then all we really need is to lay out the data stored in the structure in the arrays, that are harmonious with the memory block sizes. Then again, why not design the structures against storing items in the arrays in the first place? That is, the sets and maps should just be indexing items in an array. The array itself could be exposed through a trivial iterator. Of course, we’d benefit more from using a collection of arrays of the right size. The arrays themselves may contain the data directly, and perhaps incorporate some weak links in the representation.

Is there a huge performance penalty when taking such approach?

Also, how does the array of weak references work in OCaml? Do they store data blocks directly, or just some heap values?