Initializing Arrays with References

I discovered this odd (well, odd to me) behavior while creating an array of references.

If I create an array:

let arr = Array.make 81 (ref 0)

and then get and set the first element:

let r = Array.get arr 0
let () = r := 4143

I have now set all the references in arr to 4143.

If I instead create an array:

let arr2 = Array.init 81 (fun _ -> ref 0)

and then get and set the first element:

let r2 = Array.get arr2 0
let () = r2 := 4143

Now only the first reference is set to 4143…

Why does this behavior exist?

let arr = Array.make 81 (ref 0)

is equivalent to

let r = ref 0
let l = 81
let arr = Array.make l r

Hence a single reference is created, and that one reference is used to initialize all 81 indices of the array.
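The sharing can be made visible with physical equality (==). This is only a minimal sketch; the names shared and fresh are made up for illustration and the arrays are shortened to 3 elements.

```ocaml
(* Array.make evaluates its argument once, so every slot holds the same ref. *)
let shared = Array.make 3 (ref 0)

(* Array.init calls the function once per index, so each slot gets its own ref. *)
let fresh = Array.init 3 (fun _ -> ref 0)

let () =
  shared.(0) := 4143;
  fresh.(0) := 4143;
  (* Every slot of shared is physically the same ref, so all see the update. *)
  assert (shared.(0) == shared.(1));
  assert (!(shared.(2)) = 4143);
  (* The slots of fresh are distinct refs, so only slot 0 changed. *)
  assert (fresh.(0) != fresh.(1));
  assert (!(fresh.(1)) = 0)
```

Note the parentheses in !(shared.(2)): the prefix ! binds tighter than array indexing, so !shared.(2) would try to dereference shared itself.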

I guess the next question is: how can you tell whether an array holds one shared reference in every element, a separate reference for each element, or a mix?

You can’t, short of comparing each element with ==; in fact, some elements may share a reference while others do not, e.g.:

let r1 = ref 0
let r2 = ref 1
let a = [|r1; r1; r2|]

But the point is that it is very unlikely that knowing any of this would be useful in any way.

More generally, you wouldn’t typically use an array of references, but would use the pair (array, index) if you need a “reference” to a specific element of the array, which would allow you to get and set the element as needed.
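As a rough sketch of the (array, index) idea, assuming a plain int array: the helper names get and set and the values cells and first are hypothetical, chosen only for this example.

```ocaml
(* Treat an (array, index) pair as a "reference" to one element. *)
let get (a, i) = a.(i)
let set (a, i) v = a.(i) <- v

(* Sharing the initial value 0 is harmless here because ints are immutable. *)
let cells = Array.make 81 0

(* A "reference" to the first element. *)
let first = (cells, 0)

let () =
  set first 4143;
  assert (get first = 4143);
  (* Only the addressed slot changed. *)
  assert (cells.(1) = 0)
```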

Cheers,
Nicolas

For an existing array you can check for physical equality of all elements:

let is_all_the_same (arr : 'a ref array) : bool =
  Array.length arr = 0 || Array.for_all (fun e -> e == arr.(0)) arr

But in general, you should avoid allocating these arrays. There might be corner cases where it is a desirable property, but that would be the exception to the rule. And as @nojb mentioned, you can just keep the index of an element and use it to get/set the array value directly. Arrays are mutable.

It would be nice to have a compiler warning for “you are using an impure expression (ref …) where a value is generally expected”. There aren’t too many places where this pattern would apply, but for the cases where it does it’s very much a gotcha!

I have a feeling that there is a misunderstanding lurking here, apologies if I’m wrong. An array of references is represented by a memory block where each element contains a pointer to another memory block which then contains the contents of the ref.

I’m wondering why create an array of references of something? Arrays themselves are mutable, so you can set each index of the array directly, without using a reference.
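For instance, a minimal sketch with a plain int array (the name grid is just for illustration):

```ocaml
(* An array of plain ints; no refs involved. *)
let grid = Array.make 81 0

let () =
  (* Update one slot in place with the array assignment operator. *)
  grid.(0) <- 4143;
  assert (grid.(0) = 4143);
  (* The other slots keep their initial value. *)
  assert (grid.(1) = 0)
```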

Going further, one might say that

  1. one of the things about functional languages (and mostly-functional languages) is that you don’t worry about memory-representations and pointer-equality ("=="): structural equality ("=") is enough.

  2. but this fails when you get to mutable data. The minute you start using mutable fields in records, or arrays, or refs (which are literally a record with a single mutable field), you MUST know about and reason about the pointer-equality-structure of your data.
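The difference between the two equalities shows up as soon as refs are involved. A small sketch, with a, b and c as illustrative names:

```ocaml
let a = ref 0
let b = ref 0   (* a separate ref with the same contents *)
let c = a       (* another name for the very same ref as a *)

let () =
  assert (a = b);    (* structurally equal: same contents *)
  assert (a != b);   (* but physically distinct blocks *)
  assert (a == c);   (* a and c are the same block *)
  c := 1;
  assert (!a = 1);   (* the update through c is visible through a *)
  assert (!b = 0);   (* b is unaffected *)
  assert (a <> b)    (* and a, b are no longer structurally equal *)
```

Before the mutation, structural equality cannot tell b apart from c; only physical equality predicts which of them will change along with a.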

So a different answer to "why is Array.make 81 (ref 0) different from Array.init 81 (fun _ -> ref 0)?" is that when you carefully work out the pointer-block graph constructed by the two bits of code, you see that they’re different.

It’s why we try hard not to use mutable data-structures, unless semantically or performance-wise necessary: we all prefer not to have to understand those low-level details.
