What youâre comparing here is two different language semantics, and these semantics have to be tied to data structure representation for performance reasons.
Python is generally a reference-semantics language, as are most popular languages, including Java and Ruby. This means that when we run
a = range(3)
b = a
b is just a reference to a ie. theyâre pointing to the same thing in memory. The same thing is true for
def foo(a):
a += [1]
b = range(3)
foo(b)
b will be modified directly in this case, because weâre passing a reference to b to the function foo()
. OCaml and other functional languages tend to have value semantics, at least as far as immutable values are concerned, meaning that in
let a = [1; 2; 3]
let b = 0::a
b theoretically refers to an unrelated value than a. However, this is an illusion â b is really pointing to a new 0 element of the list, and that element then points to the elements of a. This is a useful illusion to be sure, because itâs really nice that we can pretend not to worry about sharing data between a and b, but this illusion breaks down as soon as we make things mutable:
let a = [ref 1; ref 2; ref 3] (* list of mutable references *)
let b = (ref 0)::a
(List.nth b 2) := 10 (* modifies a and b *)
So much for our pretend value semantics. So long as we avoid mutability though, we get to keep this nice notion of value semantics, allowing us to ignore the fact that thereâs sharing going on âbehind the scenesâ. This is really nice, because references to mutable data structures cause a lot of complexity and bugs.
Notice, btw, that your example,
let a = [0; 1; 2]
let b = a
let b = b @ [3] (* copies the whole list *)
isnât exactly a fair comparison. The semantics of immutable lists mean that adding to the end of the list involves rebuilding the whole list. This is like doing
a = range(3)
b = a[:] (* copy the whole python 'list' *)
b += [3] (* a is unmodified *)
ie. itâs quite inefficient, since itâs copying the whole data structure. In fact, efficiency is at the heart of a lot of this. Value semantics (having data structures be unrelated to each other) is wonderful in theory, but with most data structures, this involves heavy copying, which is why Python doesnât do it. Pythonâs âlistâ is a misnomer â
itâs really just a vector â a resizable, mutable array. OCamlâs lists are immutable by default and they use a linked list data structure, which means that it makes sense to share them when possible, pretending that theyâre not just sharing values in memory. Notice that OCamlâs array canât pull off this trick:
let a = [|1; 2; 3|]
let b = a (* points to a in memory *)
b.(2) <- 10 (* affects a and b *)
Again, due to mutability, we have to do the same sharing that python does. OCamlâs arrays donât even have a reasonable method of expansion, since theyâre not vectors: they donât resize, and we have to create a new array every time we want to add to them:
let a = [|1; 2; 3|]
let b = a (* points to a in memory *)
let b = Array.append b [|4|]
(* create a new array. No sharing, but inefficient *)
All of which is to say, different languages have different semantics, and those semantics are there for a reason. Python has reference semantics for lists and objects. OCaml has pretend-value semantics for immutable values, which is really cool (but has its own interesting limitations), but as soon as youâre dealing with mutable values, the pretense breaks down and youâre back in the land of reference semantics (for the sake of efficiency).