Compact or succinct data structures in OCaml?


#1

I’m looking for implementation of compact/succinct data structures in OCaml, such as marisa trie. My purpose is to use them in Camomile, to represent character encodings, but there are many uses of such data structures in NLP, information retrieval etc…

If not, I will treat a drink (coffee, beer, Champagne, apple juice etc) for you if you implement and publish it at opam :smiley:

For those who do not know compact/succinct data structures, compact/succinct data structures are data structures which are stored in RAM as a compressed state but sill allow fast queries. There are compact versions for list, binary tree, trie etc…


#2

Are Patricia trees such a thing?
https://www.lri.fr/~filliatr/software.en.html has some.
Radix trees maybe?
I’m sure we have some in OCaml somewhere.


#3

Are Patricia trees such a thing?
https://www.lri.fr/~filliatr/software.en.html1 has some.
Radix trees maybe?

I’m not an expert but I think usually no. Succinct or compact data structures are much more advanced and achieve (nearly) information-theoretically optimal space efficiency. https://en.wikipedia.org/wiki/Succinct_data_structure

But, yes, I’m interested in using Patricia trie.


#4

I found this post about Succinct Data Structures very clear about the matter. The idea is to compress the in-memory representation of a data structure, say a patricia tree, into two arrays: a first one with the core content and a second bit-array used as an index.