In Imperative Programming - Real World OCaml, in the first “memoization” example, there is the following sentence:
memo_table is referred to by the function, and so won’t be collected until the function returned by
memoize is itself collected.
What does “collected” mean, here ? The (standard ?) meaning I can think of is that from set theory (for instance in the axiom of collection), so this would mean that the pairs (key, value) of the hash table
memo_table are “collected” to be considered as a whole (as a collection) only once the function returned by
memoize is called ? But this does not make much sense…
This is probably the crux to understanding the example:
let memoize f =
let memo_table = Hashtbl.create (module Int) in
(fun x ->
Hashtbl.find_or_add memo_table x ~default:(fun () -> f x));;
val memoize : (int -> 'a) -> int -> 'a = <fun>
I understand what it is intended to do, but I do not understand why it works: it looks like each time
memoize is called, a new (and empty)
memo_table will be created, defeating the purpose of the function ?
To be explicit, if I define
let f x = x + 1;; and then
let g = memoize f;;, then, each time I call
g 1;; and later
g 2;;), the machine will have to compute
(memoize f) 1;; and later
(memoize f) 2;;, so each time there will be a new execution of
Hashtbl.create. Where is the flaw in my reasoning ? (This is probably a recurring beginner’s question, but I haven’t found an answer so far.)
I think in this context it means “garbage collected”, that is that the
Hashtbl will stay in memory until the function that is returned is also garbage collected.
Thanks @Leonidas ! That completely makes sense, and I don’t know why I did not make the connection “collected” with “garbage collected”… There is even a chapter about it in that book, although it is sixteen chapters later
Any idea for the second question ?
Collected means garbage-collected in this context. The underlying question question is for how long the memoized function will remember the association
f x =
y for a value
Concerning the memoization, the crux of your issue seem to be that in
let f x = 1 + x
let g = memoize f
let u = g 0, g 1, g 0
the memoize function is called once, and there is only one table created.
Due to side effects,
g 0 is not equivalent to
(memoize f) 0.
A simpler example of this issue of semantics in presence of side-effect would be
let x = Printf.printf "x\n"; 1
let y = x + x + x
With the eager call-by-value evaluation used by OCam, the string
"x\n" is printed once.
Oh sorry, I sort of overlooked that second part. You observe rightly that calling
memoize will create a new hash table every single time. But the crux is, that
memoize is only called once per function that should be memoized: as you see memoize doesn’t return a value but you pass in a function and it returns another (anonymous) function that wraps your function.
let f x = Int.succ x
let memo_f = memoize f
let fourty_three = memo_f 42
memo_f is not the original function
f but rather the new function that was created inside memoize and it wraps your original
f — it only calls it when it doesn’t know the result yet.
Thanks @octachron, but I am not sure what is different in the
memoize example and in the following example:
let mul x y = print_endline "A"; x * y;;
let mul2 = mul 2;;
This prints “A” the last two times, and does not print it when
mul2 is defined. What’s different compared with your
memoize example ?
Note that the same happens if I define
let mult x y = let () = print_endline "A" in x * y;;
let mult2 = mult 2;;
mult2 4 + mult2 7;;
Then “A” is printed twice after the last command, but not when
mult2 is defined.
let mul x y = print_endline "A"; x * y is just syntactic sugar for
let mul = fun x -> fun y -> print_endline "A"; x * y". That way, when you define
mul2, after one step of beta reduction, it’s as if you have wrote
let mul2 y = print_endline "A"; 2 * y; a function that will print at each call. If you want something similar to memoization, test with:
let mul = fun x -> print_endline "A"; fun y -> x * y
(* compare to your example *)
let mul = fun x -> fun y -> print_endline "A"; x * y