Adding Dynamic Arrays / Vectors to StdLib

gasche · March 13, 2023, 9:01pm

As someone who was also nerd-sniped by this question, I recognize the crave and I thank you for the comparison with other programming languages, many of those I had not checked and the comparison is interesting.

I think that you can actually do something related to this to_seq_slice construction you have in mind with Dynarray.to_seq a |> Seq.take (Dynarray.length a). This sequence will stop early if elements are deleted, but not grow if elements are added. (One can of course use Dynarray.to_array a for this, but I supposed you wanted to avoid the intermediate copy.)

gadmm · March 13, 2023, 9:03pm

This is a good topic, in fact I am not convinced by the difference between to_seq and other iterators.

val to_seq : 'a t -> 'a Seq.t
(** [to_seq a] is the sequence of elements
    [get a 0], [get a 1]... [get a (length a - 1)].

    Because sequences are computed on-demand, we have to assume that
    the array may be modified in the meantime, and give a precise
    specification for this case below.

Especially because the data flow is less obvious with sequences, I would expect an exception (=programming error) if the structure of the dynarray changes during iteration.

gasche · March 13, 2023, 9:17pm

I could consider providing both versions if we can find good names for both.

gadmm · March 13, 2023, 9:25pm

My idea is more about discouraging the footgunny one than letting users choose. What usages do you have in mind for the current specification?

c-cube · March 13, 2023, 10:40pm

I fixed the CCVector.to_seq behavior to match to_iter. It’s easier and cleaner to capture the current length and array initially anyway.

threepwood · March 13, 2023, 11:19pm

Ah, smart. I always forget that defining a non-terminating sequence is not necessarily a big deal, you don’t have to read it all.

Ok that makes sense, I did some tests a few days back with some old version, and again today with the latest version, but I didn’t double check to_seq, only iter.

silene · March 14, 2023, 5:53am

I am not sure where you are reading that, but the behavior has always been static, even in the original proposal from 2005: “Notice that the end iterator is only calculated once.” (Proposal for new for-loop)

That is just a lucky accident of existing implementations. The C++ standard is quite clear that this is undefined behavior: “If no reallocation happens, then references, pointers, and iterators before the insertion point remain valid but those at or after the insertion point, including the past-the-end iterator, are invalidated.” You cannot keep using the original value of arr.end(), even if the capacity was reserved.

gasche · October 22, 2023, 9:10am

Hi everyone,

A short note: we eventually manage to build consensus on https://github.com/ocaml/ocaml/pull/11882 and I merged it yesterday, so there will be a Dynarray module in the standard library in OCaml 5.2. Thanks @BastouP411 and @c-cube for their efforts bringing us to this point, and to the many reviewers of the latest iteration of the PR (in particular @314eter, @c-cube, @clef-men, @damiendoligez, @dbuenzli, @gadmm, @Octachron and @wiktor).

The implementation uses an indirection to get a type-safe “empty” value to use in unused elements of the support array. Hopefully we can improve on that with a safe magical implementation in the short future – but I think that the performance is good enough already.

micahcantor · October 22, 2023, 4:44pm

Just want to say thanks everyone for the hard work on getting this merged . Sometimes the “obvious” improvements to a language are the hardest to get consensus on, but I’m glad this got done!

gadmm · October 23, 2023, 6:25pm

Hopefully we can improve on that with a safe magical implementation in the short future – but I think that the performance is good enough already.

I remember the second half being discussed in the thread and I do not think that this conclusion held up. In particular it is very difficult to have an experiment that is relevant wrt cache-locality effects using micro-benchmarks.

gasche · October 24, 2023, 7:26am

We did discuss this, and I ran measurements on the effect of boxing/indirection on a real-world dynarray-intensive program: https://github.com/ocaml/ocaml/pull/11882#issuecomment-1405867624 .

c-cube · October 25, 2023, 12:33am

This was a good experiment. Thank you for your prolonged efforts and for pushing this to the finish line!

giltho · October 31, 2023, 10:13am

Excited to experiment with it! Thank you all for your hard work

I do enjoy the irony that one of the main reason the very first proposal (the one that led to the creation of this discuss thread) was rejected was because of the indirection. (I know that the new decision makes sense for different reasons (OCaml 5 etc), I just find it funny)

Topic		Replies	Views
[ANN] varray 0.2 Ecosystem announce	1	598	November 25, 2023
[ANN] nullable-array 0.1 (initial release) Community announce	7	1841	May 24, 2017
[ANN] containers 3.3 Community announce	1	859	April 4, 2021
Ocaml stdlib and death by a thousand papercuts Ecosystem	89	5709	January 2, 2025
[ANN] vec 0.2.0 Community announce	9	1256	May 25, 2021

Adding Dynamic Arrays / Vectors to StdLib

Related topics