Adding Dynamic Arrays / Vectors to StdLib

As someone who was also nerd-sniped by this question, I recognize the crave and I thank you for the comparison with other programming languages, many of those I had not checked and the comparison is interesting.

I think that you can actually do something related to this to_seq_slice construction you have in mind with Dynarray.to_seq a |> Seq.take (Dynarray.length a). This sequence will stop early if elements are deleted, but not grow if elements are added. (One can of course use Dynarray.to_array a for this, but I supposed you wanted to avoid the intermediate copy.)

1 Like

This is a good topic, in fact I am not convinced by the difference between to_seq and other iterators.

val to_seq : 'a t -> 'a Seq.t
(** [to_seq a] is the sequence of elements
    [get a 0], [get a 1]... [get a (length a - 1)].

    Because sequences are computed on-demand, we have to assume that
    the array may be modified in the meantime, and give a precise
    specification for this case below.

Especially because the data flow is less obvious with sequences, I would expect an exception (=programming error) if the structure of the dynarray changes during iteration.

I could consider providing both versions if we can find good names for both.

My idea is more about discouraging the footgunny one than letting users choose. What usages do you have in mind for the current specification?

I fixed the CCVector.to_seq behavior to match to_iter. It’s easier and cleaner to capture the current length and array initially anyway.

Ah, smart. I always forget that defining a non-terminating sequence is not necessarily a big deal, you don’t have to read it all.

Ok that makes sense, I did some tests a few days back with some old version, and again today with the latest version, but I didn’t double check to_seq, only iter.

I am not sure where you are reading that, but the behavior has always been static, even in the original proposal from 2005: “Notice that the end iterator is only calculated once.” (Proposal for new for-loop)

That is just a lucky accident of existing implementations. The C++ standard is quite clear that this is undefined behavior: “If no reallocation happens, then references, pointers, and iterators before the insertion point remain valid but those at or after the insertion point, including the past-the-end iterator, are invalidated.” You cannot keep using the original value of arr.end(), even if the capacity was reserved.

Hi everyone,

A short note: we eventually manage to build consensus on https://github.com/ocaml/ocaml/pull/11882 and I merged it yesterday, so there will be a Dynarray module in the standard library in OCaml 5.2. Thanks @BastouP411 and @c-cube for their efforts bringing us to this point, and to the many reviewers of the latest iteration of the PR (in particular @314eter, @c-cube, @clef-men, @damiendoligez, @dbuenzli, @gadmm, @Octachron and @wiktor).

The implementation uses an indirection to get a type-safe “empty” value to use in unused elements of the support array. Hopefully we can improve on that with a safe magical implementation in the short future – but I think that the performance is good enough already.

24 Likes

Just want to say thanks everyone for the hard work on getting this merged :tada:. Sometimes the “obvious” improvements to a language are the hardest to get consensus on, but I’m glad this got done!

1 Like

Hopefully we can improve on that with a safe magical implementation in the short future – but I think that the performance is good enough already.

I remember the second half being discussed in the thread and I do not think that this conclusion held up. In particular it is very difficult to have an experiment that is relevant wrt cache-locality effects using micro-benchmarks.

2 Likes

We did discuss this, and I ran measurements on the effect of boxing/indirection on a real-world dynarray-intensive program: https://github.com/ocaml/ocaml/pull/11882#issuecomment-1405867624 .

5 Likes

This was a good experiment. Thank you for your prolonged efforts and for pushing this to the finish line!

6 Likes

Excited to experiment with it! Thank you all for your hard work :slight_smile:

I do enjoy the irony that one of the main reason the very first proposal (the one that led to the creation of this discuss thread) was rejected was because of the indirection. (I know that the new decision makes sense for different reasons (OCaml 5 etc), I just find it funny)

1 Like