Note that memo_table is referred to by the function, and so won’t be collected until the function returned by memoize is itself collected.
What does “collected” mean, here ? The (standard ?) meaning I can think of is that from set theory (for instance in the axiom of collection), so this would mean that the pairs (key, value) of the hash table memo_table are “collected” to be considered as a whole (as a collection) only once the function returned by memoize is called ? But this does not make much sense…
This is probably the crux to understanding the example:
let memoize f =
let memo_table = Hashtbl.create (module Int) in
(fun x ->
Hashtbl.find_or_add memo_table x ~default:(fun () -> f x));;
val memoize : (int -> 'a) -> int -> 'a = <fun>
I understand what it is intended to do, but I do not understand why it works: it looks like each time memoize is called, a new (and empty) memo_table will be created, defeating the purpose of the function ?
To be explicit, if I define let f x = x + 1;; and then let g = memoize f;;, then, each time I call g (e.g., g 1;; and later g 2;;), the machine will have to compute (memoize f) 1;; and later (memoize f) 2;;, so each time there will be a new execution of Hashtbl.create. Where is the flaw in my reasoning ? (This is probably a recurring beginner’s question, but I haven’t found an answer so far.)
I think in this context it means “garbage collected”, that is that the Hashtbl will stay in memory until the function that is returned is also garbage collected.
Thanks @Leonidas ! That completely makes sense, and I don’t know why I did not make the connection “collected” with “garbage collected”… There is even a chapter about it in that book, although it is sixteen chapters later
Any idea for the second question ?
Collected means garbage-collected in this context. The underlying question question is for how long the memoized function will remember the association f x = y for a value x.
Concerning the memoization, the crux of your issue seem to be that in
let f x = 1 + x
let g = memoize f
let u = g 0, g 1, g 0
the memoize function is called once, and there is only one table created.
Due to side effects, g 0 is not equivalent to (memoize f) 0.
A simpler example of this issue of semantics in presence of side-effect would be
let x = Printf.printf "x\n"; 1
let y = x + x + x
With the eager call-by-value evaluation used by OCam, the string "x\n" is printed once.
Oh sorry, I sort of overlooked that second part. You observe rightly that calling memoize will create a new hash table every single time. But the crux is, that memoize is only called once per function that should be memoized: as you see memoize doesn’t return a value but you pass in a function and it returns another (anonymous) function that wraps your function.
let f x = Int.succ x
let memo_f = memoize f
let fourty_three = memo_f 42
memo_f is not the original function f but rather the new function that was created inside memoize and it wraps your original f — it only calls it when it doesn’t know the result yet.
That’s because let mul x y = print_endline "A"; x * y is just syntactic sugar for let mul = fun x -> fun y -> print_endline "A"; x * y". That way, when you define mul2, after one step of beta reduction, it’s as if you have wrote let mul2 y = print_endline "A"; 2 * y; a function that will print at each call. If you want something similar to memoization, test with:
let mul = fun x -> print_endline "A"; fun y -> x * y
(* compare to your example *)
let mul = fun x -> fun y -> print_endline "A"; x * y