[ANN] varray 0.2

Hello everyone,

I’m not so pleased to announce a bugfix release of the varray package on opam. This library provides an implementation of dynamic arrays, which automatically resize as elements are added or removed from the array. It’s based on the really fun paper “Tiered Vectors: Efficient Dynamic Arrays for Rank-Based Sequences” by Michael T. Goodrich and John G. Kloss II. When I first heard about it, I could not resist implementing this datastructure because its algorithmic complexities are rather fancy:

  • O(1) to get/set elements anywhere in the array
  • O(1) to add or pop an element at the front/back of the array
  • O(ᵏ√N) to add or pop an element anywhere in the middle of the array (for any k >= 1)

And because there’s a fun way of exposing the API using OCaml’s functors:

module One   = Varray.Circular   (* k=1 => O(N) complexity for insert/delete *)
module Two   = Varray.Root (One) (* k=2 => O(√N) complexity *)
module Three = Varray.Root (Two) (* k=3 => O(³√N) complexity *)

More details on the github repo and the online API documentation (which includes a teaser for odoc upcoming search feature, thanks to @EmileTrotignon, @panglesd and the wonderful odoc team :heart: )

Yet, I never publicly announced the initial release of this package on opam… because I don’t think dynamic arrays are actually useful when programming in OCaml! Using integer indices to address elements is prone to “index out of bounds” bugs, so it’s rarely the right choice. If you do have a usecase for them, I would love to hear it :slight_smile:


But ok, so, wait, why am I announcing it now? Well because @n-osborne found a bug in my code which could trigger a segfault! This terrible bug was only affecting the lesser-used delete_at operation, and so it lay dormant for two years. This was a very dumb mistake: This specific function was missing a check for out of bound indexes (ha!)

While I would prefer to avoid the public walk of shame, how the bug was discovered is too cool not to share: @n-osborne and @shym wrote a Gospel specification of how the Varray library should behave, which enabled them to stress test it and discover a counter-example where my code was misbehaving. Here’s an extract of the faulty function specification, where Gospel specs are written as special comments (*@ ... *) in the mli documentation:

type 'a t
(*@ mutable model contents : 'a sequence *) 
(* ^^^ Gospel modelization of the abstract varray type
       as a "mutable sequence called contents" *)

val delete_at : 'a t -> int -> unit
(** [delete_at t i] removes the element [t.(i)].
    Every element on the right of [i] is shifted by one to the left. *)
(*@ delete_at t i
    checks inside i t.contents
    modifies t.contents
    ensures t.contents = old (t.contents[..(i - 1)] ++ t.contents[(i + 1)..]) *)

Note the use of the keyword old on the last line, which allows the postcondition ensures to refer to the state of the input before it was imperatively modified. Formal specifications are generally reserved to specialists, but look how readable this is! (see the full varray spec)

Gospel specs can then be interpreted by a variety of tools. To discover this specific bug, @n-osborne used Ortac to automatically translate his specification into an executable QCheck-STM test to search for incoherencies between the model and the actual implementation.

Very very cool stuff. If you are looking for more resources on the subject, I found Chapter 2 of Clément Pascutto’s PhD thesis to be a very nice introduction to Gospel specifications :slight_smile: (and later chapters describe his cutting edge research for Ortac!)

36 Likes

For those speaking french and interested in Gospel/Ortac, there’s a video of Clément’s talk given at the oups meetup last year.

6 Likes