Scala Vector-like structure in OCaml

datastructures

#1

Hello everyone! I’m new to Ocaml and I was wondering if it has any data structure similar to Scala’s immutable vector: http://www.scala-lang.org/api/2.12.0/scala/collection/immutable/Vector.html (idea behind it here: https://infoscience.epfl.ch/record/169879/files/RMTrees.pdf). It’s basically a more powerful List, when you can append at both ends, concatenate, split, or change elements, all in O(logN). In Scala I was using it a lot and find it much more useful than Lists. I couldn’t find anything similar in ocaml, but maybe I wasn’t searching for the right keywords. External libraries would also be useful. Thanks!


#2

I don’t know whether this exists, but I’d bet that there is something similar. If you haven’t already done so, you should look in Containers, Batteries Included, and Core or Core_kernel. There might also be other, smaller libraries in opam that provide the kind of data structure you want.


#3

There’s nothing like that in containers. I feel like it’s missing, but I could probably adapt the weird-ish hash trie to work exclusively with consecutive integer keys. I’ll think about it…


#4

Hi, it looks like the Batteries Included library indeed has a vector implementation: http://ocaml-batteries-team.github.io/batteries-included/hdoc2/BatVect.html

There is also a mutable, array-backed vector available at https://github.com/gsg/ocaml-vector


#5

Thanks a lot for your replies. It seems like the BatVec is so far the closest one to what I was looking for.


#6

Does Scala currently use RRB trees, or just the radix balanced trees from the background of that paper? The description of the standard library immutable vectors here doesn’t mention faster concatenations, and the implementation here makes it sound like this isn’t the current stdlib vector implementation.


#7

The immutable-re library has a vector, but it only works with OCaml < 4.05.


#8

So I started writing something based on trees with up to 32 values and 32 children. It should be called Fun_vec and should provide roughly the same interface as a mutable vector (fast append on the right, fast lookup).

first benchmark (comparing it to RAL which is based on Okasaki’s “purely functional random access lists”, i.e. a list of complete binary trees of increasing size):
https://paste.isomorphis.me/0IK
It’s already significantly faster than RAL!


#9

Hi Simon, how does your thing compares to the batteries’ stuff?
Thanks,
Francois.


#10

And there is my implementation of RRB-trees in OCaml.


#11

The algorithm is different from BatVec’s underlying rope. I wrote a kind of HAMT-like structure where all nodes contain an array of up to 32 values, and up to 32 subnodes, and filled in increasing index order. Benchmarks with BatVect.

I expect it to be quite fast in practice for batch operations by using transient mutability to diminish the number of intermediate structures allocated. Right now it’s only a first draft.

I didn’t compare to clarity (it needs 4.04 :confused: and I’m testing on 4.03 — also no flambda because my benchmarks trigger a bug in the compiler).


#12

Seems that Daniel Bünzli is on it with pvec. That should be a high quality library!