Are there any examples of immutable union-find structure implementations in OCaml ? It is quite easy to find mutable implementations of union-find structures in imperative languages like C or Java, and most of the documentation about this data structure assumes one uses such a language, but I haven’t found immutable implementations in functional languages yet. If you have an idea of something to read about it on the internet, do not hesitate to give me some links (even if it’s written in Haskell or Scala), that would be great !
This article could be interesting: A persistent Union-Find data structure by Jean-Christophe Filliâtre and Sylvain Conchon. It has been proved correct using Coq but the code is written in OCaml.
One approach is to implement a traditional Union-Find data structure, parameterized over an implementation of a store (which provides the types of stores and references and the operations ref, get, set on references). Then, you can instantiate the algorithm with a primitive mutable store, with a persistent store implemented using a map, or with other things (e.g., a store that supports transactions that can be aborted; a store implemented using an array; etc.).
I have done this in OCaml, but the code isn’t part of a published project. If you are interested, I could email it to you, or I could be convinced to spend some time and publish it as a library.
It’s persistent, but it isn’t immutable. While its internal structure contains mutable elements, and it behaves mostly like an immutable union-find, it’s not strictly what the @cloudyhug is asking about. I’ve never seen an efficient implementation of a pure functional union-find. Does it exist?
Sure, there is side effect internally, but from a user point of view it behaves like an immutable date structure, and most of the time this what we want. As long as we don’t use it in a muti-threaded context, the user will not see the difference with a pure date structure: the persistent array implementation is not thread safe due to the reroot operation.
By the way, this is possible to use pure data structure but it seems it is not very efficient: see the last two lines of the performance paragraph.
If I understand correctly @fpottier, they mostly did in the article what he said and choose to use a persistent array module (implemented using side effect) for efficiency reason.