Type for mutable sets

user1 · May 31, 2022, 1:41pm

It looks like the Stdlib has no data structure for (mutable) sets ? Indeed, OCaml library : Set.Make represents (immutable) totally ordered sets.

Can I simply define

type 'a set = ('a, unit) Hashtbl.t

(see OCaml library : Hashtbl) and be content with it ? Or are there better ways ?

vlaviron · May 31, 2022, 2:50pm

It depends a bit on why you need mutable sets. The Hashtbl option is perfectly fine, but the other obvious solution is to use Set.Make to create immutable sets and manipulate references to sets for mutability.

user1 · May 31, 2022, 3:11pm

Thanks. Set.make creates totally ordered sets, and there is no natural total order on the things I want to collect in a set. (Of course, I could put one, but this would feel arbitrary and superfluous.)

As for the general trick of “emulating” a mutable data structure by taking a reference to its immutable counterpart, it seems that it would hamper performance ? I mean: the performance advantage of mutability is that you do not have to recreate (reallocate?) a new structure each time you mutate it, but if you mutate a reference to an immutable data structure, you have to create a new contents field anyway, so this amounts to recreating an object of type the immutable data structure you started with, and nothing is gained ?

jjb · May 31, 2022, 3:28pm

Relative performance will depend on which operations you use often. Sets represented as hash tables are fine if your hash function is fast and you don’t need the binary operations like union. For binary operations, the immutable representation is expected to allocate less in most cases.

vlaviron · May 31, 2022, 3:53pm

Not always. Keep in mind that a big advantage of immutable data structures is that they can use sharing. Typically, adding an element to a set isn’t much more expensive than adding an element to a list, regardless of the size of the set. This difference could also become quite important if you need backtracking: with a typical mutable structure (such as Hashtbl), if you want to save the current state for later you need a deep copy. If you use a reference instead this can be done by simply storing the current value of the reference, which is more or less free.

The argument about not having to reallocate is usually only true for big records, as my_record.f <- new_value can indeed be much less expensive than { my_record with f = new_value }. But sets are almost always going to be implemented by lots of small cells linked together (whether using mutable or immutable structures), so adding an extra reference cell on top of the structure doesn’t change much.

Topic		Replies	Views
Best practice to build adjacency list Learning	13	1062	May 13, 2022
Immutable Union-Find Structure Learning datastructures	4	2207	November 7, 2018
Is a mutable project structure inherently slower? Learning	8	904	August 15, 2023
How to keep inner mutable data synchronized Learning	1	366	January 21, 2024
Mutable sorted data structure Learning	8	1673	June 29, 2019

Type for mutable sets

Related topics