Announcing Sek, an efficient implementation of sequences

Fellow OCaml users,

We are pleased to announce the first release of Sek, an OCaml library that
offers an efficient implementation of sequences.

The library offers both ephemeral (mutable) sequences and persistent
(immutable) sequences, and offers constant-time conversions between these
flavors.

It supports all of the standard operations on stacks, queues, deques (e.g.
push, pop at either end), catenable sequences (concat, split), and random
access sequences (get, set).

Data is stored internally in chunks (fixed-capacity arrays),
which is why this data structure is known as a chunK SEquence.

It is intended to achieve excellent time complexity and memory usage.

This is an initial release. The library has not been tested in production,
but has received extensive unit testing, via afl-fuzz and ocaml+afl –
which are remarkably effective tools, by the way!

This is work in progress; more features, such as iterators, will be added
in the future.

To install Sek, just type

  opam update && opam install sek

Documentation is online.

Feedback is welcome!

Arthur Charguéraud
François Pottier
with contributions by Émilie Guermeur

8 Likes

Exciting stuff! Do you have any benchmarking to compare it to the other sequence libraries out there? I’m particularly interested in how it compares to Base.Sequence and Seq in the OCaml distribution, but surely there are others as well.

y

1 Like

Thanks!
Regarding benchmarks: we have preliminary results; they look good; we still need to complete the benchmarks and write text explaining what we are measuring exactly.
+
Arthur

This actually looks like an array/vector structure (supporting, among other things, fast access to the nth element), so a comparison with CCVector, CCFun_vec, BatVect, Clarity.Vector etc. would be more appropriate. The name is a bit unfortunate considering the naming used in the general ecosystem.

Some time ago, I added some crude benchmarks to containers’ benchsuite. I’ll see if I can add Sek when I find time.

2 Likes

I think it really is a sequence library in the sense that in maintains an in-order sequence of items, and sequences can be joined/split efficiently. It also provides logarithmic random access, but this is probably not competitive with fixed-size arrays. It would be comparable to “persistent vector” libraries, ropes, finger trees, etc. The fact that the authors expose a Stack/Queue interface suggests that it has also been tuned to perform reasonably well in this case.

It does not provide any delayed computation of items, so in that regard it is not comparable to Sequence/Seq.

@charguer has designed similar datastructures in the past to represent the work-queues of concurrent workers (you want at least a fast “push” to add a new task and, when doing work-stealing, having a fast “split” is convenient). See Theory and Practice of Chunked Sequences, Umut Acar, Arthur Charguéraud, Mike Rainey, 2014, and A Work-Efficient Algorithm for Parallel Unordered Depth-First Search.

As far as I know, the OCaml implementation just released has not been tested/benchmarked for parallel algorithms. I would be curious to see an experiment of parallel graph traversal with this structure and Multicore-OCaml.

3 Likes