Is reference semantics essentially value semantics?

In OCaml, a ref is still a value.

It’s said that Java has reference semantics. But in:

ArrayList<Integer> numbers = new ArrayList<>();
foo(numbers);

numbers is a value (of type ArrayList<Integer> ref in OCaml terms) nonetheless.

So everything is a value after all?

My understanding is that what people mean when they say that Java has reference semantics is that when you declare a variable of object type, the variable is implicitly a reference/pointer, and there is no way to “dereference” it, as you could do in C via the * operator.

More generally, from the point of view of runtime semantics, you could wonder about the meaning of the assignment operation a = b. In C, depending on the type of the variable, this may mean that a and b now refer to the same object (if the variables are of pointer type), or, if the variables are of structure type, it means that a is a copy of b, different from b (in the sense of !=). In Java, since all variables of object type are implicitly pointers, it always means that a and b now point to the same object.

Note that the semantics of OCaml (and other languages of its family) are closer to Java than to C: allocated objects are implicitly pointers and there is no way to “dereference” them (note that this has nothing to do with ref). Copies of allocated objects need to be made explicitly (as in Java), and are never made implicitly (as in C). As the difference between a reference to an object and a copy of said object is only crucial when you are dealing with mutable data structures, this discussion comes up less often in OCaml which deals mostly with immutable data structures, but it does still come up when dealing with mutable ones, such as ref and array.
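
A minimal sketch of the point about explicit copies, using only Array.copy from the standard library. The names a, alias and fresh are made up for illustration:

```ocaml
(* Binding a second name to an array does not copy it. *)
let a = [| 1; 2; 3 |]
let alias = a               (* same array, no copy is made *)
let fresh = Array.copy a    (* explicit (shallow) copy *)

let () =
  a.(0) <- 42;
  (* The alias sees the update; the explicit copy does not. *)
  Printf.printf "alias.(0) = %d, fresh.(0) = %d\n" alias.(0) fresh.(0);
  (* Physical equality (==) tells the two cases apart. *)
  Printf.printf "alias == a: %b, fresh == a: %b\n" (alias == a) (fresh == a)
```

This should print alias.(0) = 42, fresh.(0) = 1, then alias == a: true, fresh == a: false.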

Cheers,
Nicolas


Thanks!

But a pointer is just a value. Same for references.

In a = b, a is assigned the value of b, even if they are references.

Yes, but in C, if you have the following:

int *a;

You have a distinction between a and *a. You do not have this distinction in OCaml or Java, even though most values are pointers.

In OCaml, I think we have this distinction via the ref type. An int or a person is a value, while an int ref or a person ref is a reference (which is still a value). Here person is a product type:

type person = {
  name: string;
  age: int;
}

(* alice is a value *)
let alice = {name = "Alice"; age = 10}
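
To make the r versus !r distinction concrete, here is a small sketch. The names r and same_cell are hypothetical, and this is only one way to illustrate the idea:

```ocaml
type person = { name : string; age : int }

let alice = { name = "Alice"; age = 10 }

(* r is a reference (itself a value); !r is the person it currently holds. *)
let r = ref alice
let same_cell = r   (* copies the reference, not the person *)

let () =
  r := { alice with age = 11 };
  (* Both names denote the same cell, so both see the new contents. *)
  assert ((!same_cell).age = 11);
  (* The immutable record bound to alice is untouched. *)
  assert (alice.age = 10)
```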

If I had to guess what reference semantics versus value semantics mean, I’d say that for product types, if the natural way to pass a value to a function is to copy the contents then you’re in a value semantics context, whereas if the natural way to pass it is to pass the address then you’re in a reference semantics context.

So if I write something like:

type product = {
  mutable x : int;
  mutable y : int;
}

let f p = p.x <- 1

let () = let p = { x = 0; y = 0 } in f p; print_int p.x

In OCaml, this prints 1, so we’re using reference semantics. In C, if you pass an object of type struct ..., it is copied and the equivalent code will print 0. If you change the type to be a struct ... *, then you’re passing the parameter by reference (i.e. you’re passing a value that is a pointer, which is often called a reference), and there’s no copy involved so you get 1 printed.
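
For comparison, the copying that C does for a struct argument can be emulated in OCaml by copying by hand. A rough sketch, where copy_product is a hypothetical helper and not something the language provides:

```ocaml
type product = {
  mutable x : int;
  mutable y : int;
}

(* Mutates the record it receives; the caller sees the change. *)
let f p = p.x <- 1

(* Hypothetical helper: builds a fresh record with the same field values,
   roughly what C does implicitly when a struct is passed by value. *)
let copy_product p = { x = p.x; y = p.y }

let p = { x = 0; y = 0 }

(* f mutates the copy only, so p.x is still 0 afterwards. *)
let after_copy_call = (f (copy_product p); p.x)

(* f mutates p itself, so p.x is now 1. *)
let after_direct_call = (f p; p.x)

let () = Printf.printf "%d %d\n" after_copy_call after_direct_call
```

This should print 0 1: the first call behaves like the C struct case, the second like the C pointer case.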


AFAIK, ref isn’t really part of the OCaml language, just the stdlib. It’s defined as type 'a ref = { mutable contents : 'a }. You can easily redefine it yourself.

type 'a my_ref = { mutable my_contents: 'a }
let ref x = { my_contents = x }
let ( ! ) r = r.my_contents
let ( := ) r x = r.my_contents <- x

Any mutable field can be used as a “ref” basically. The stdlib just defines the ref type for us because it’s almost always useful.
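
Since the stdlib ref is an ordinary record with a mutable contents field, the usual operators and plain record syntax can be mixed freely. A quick sketch (r, s and greeting are illustrative names):

```ocaml
let r = ref 0

let () =
  r.contents <- 5;          (* same effect as r := 5 *)
  assert (!r = 5);
  r := 7;
  assert (r.contents = 7)   (* same value as !r *)

(* A ref can also be built and destructured with record syntax. *)
let s = { contents = "hello" }
let { contents = greeting } = s
```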


In C, if you pass an object of type struct ..., it is copied and the equivalent code will print 0.

That’s incorrect. Consider the C equivalent of your OCaml code:

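#include <stdio.h>

struct product {
  int *x;
  int *y;
};

void f(struct product p) {
  *(p.x) = 1;  /* writes through the pointer, so the caller's int changes */
}

int main(void) {
  int a = 0, b = 0;
  struct product p = { .x = &a, .y = &b };
  f(p);  /* the struct is copied, but the copy holds the same pointers */

  printf("%d\n", *(p.x)); /* prints 1, not 0 */
  return 0;
}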

I don’t understand why you’re using pointers. In my mind, the C equivalent would be using struct product { int x; int y; }.


I think a mutable field is essentially a pointer.

There’s a difference, both in OCaml and in C, between a mutable field and a pointer (in C, unless const is involved, all fields are mutable).

In OCaml:

module type Check_intf = sig
  type t
  val zero : t
  val get : t -> int
  val copy : t -> t
  val update : t -> int -> unit
end

module Mut = struct
  type t = {
    mutable x : int;
  }

  let zero = { x = 0 }
  let get t = t.x
  let copy src = { x = src.x }
  let update t new_x = t.x <- new_x
end

module Ref = struct
  type t = {
    x : int ref;
  }

  let zero = { x = ref 0 }
  let get t = !(t.x)
  let copy src = { x = src.x }
  let update t new_x = t.x := new_x
end

let check msg (module M : Check_intf) =
  let a = M.zero in
  let b = M.copy a in
  M.update a 42;
  Format.printf "%s: %d@." msg (M.get b)

let () =
  check "Ref" (module Ref);
  check "Mut" (module Mut);

This will print the following output:

Ref: 42
Mut: 0

This is because ref cells introduce an extra indirection: the copies still contain the same reference (in C terms, they’re int * fields), so an update made through one is visible through the other. With mutable fields there is no indirection, so the copies don’t share a location and updating one doesn’t update the other. They behave like int fields in a C struct.


You’re right. Thanks!

type person = {
  mutable age: int;
}

let print_person (p: person) =
  print_int p.age

let update_person (p: person) =
  p.age <- 1000

let alice = {age = 10}

let () =
  update_person alice;
  print_person alice

This example (together with yours) shows that OCaml does have “reference semantics”.

My realization:

In a world of reference semantics (RS), there are two kinds of entities: names and values. In a world of value semantics (VS), there is only one kind of entity: values.

Examples:

  1. assignment/definition a = b
    RS interpretation: name a is bound to the value that name b is currently bound to
    VS interpretation: value a is assigned to be a copy of value b

  2. function call foo(p)
    RS interpretation: foo works with the value that name p is bound to
    VS interpretation: foo works with a copy of value p
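
The two readings above can be sketched in OCaml, which behaves in the RS way by default; the VS reading has to be emulated with a hand-written copy. The type point and the function bump are hypothetical names for illustration:

```ocaml
type point = { mutable x : int }

(* RS reading of "b = a": b is a second name for the same record. *)
let a = { x = 0 }
let b = a

(* VS reading, emulated by hand: c is a distinct copy of a. *)
let c = { x = a.x }

(* A function that mutates its argument. *)
let bump p = p.x <- p.x + 1

let () =
  bump a;             (* RS call: works on the record a is bound to *)
  bump { x = a.x };   (* VS-style call: works on a throwaway copy *)
  Printf.printf "a.x = %d, b.x = %d, c.x = %d\n" a.x b.x c.x
```

This should print a.x = 1, b.x = 1, c.x = 0: b follows a because both names are bound to the same record, while the copy c and the throwaway copy passed to bump leave a alone.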