Hello!
TL;DR: my attempt at optimizing a program using Seq instead of List ended up using 3 to 4 times the memory, and I don’t understand why…
I’m currently trying to optimize some code that seems to eat a lot of memory for some reason.
I noticed that code has several List.map and List.filters done in a row on the same list. I thought I’d use Seq instead at that point, but I’m not used to really think about optimization so I thought I’d do some tests on my own.
I’m very surprised about the results.
Here’s is the test program :
let () = Random.self_init ()
module WITH_SEQ = struct
let of_list = List.to_seq
let ( >| ) o f = Seq.map f o
let to_list = List.of_seq
end
module WITH_LIST = struct
let of_list x = x
let ( >| ) o f = List.rev_map f o
let to_list x = x
end
open WITH_LIST
(* open WITH_SEQ *)
let higher_bound = 10000
let size = 1000000
let make size = of_list @@ List.init size (fun _ -> Random.int higher_bound)
let f1 a = a + Random.int 12134342
let f2 a = a * Random.int 23
let f3 a = a - Random.int 2343432
let do_ () =
let l = make size >| f1 >| f2 >| f3 |> to_list in
List.length l |> string_of_int |> print_endline
let main () =
Memtrace.trace_if_requested ();
for _ = 0 to 20 do
do_ ()
done
let () = main ()
The program can be simply executed choosing which of WITH_LIST
or WITH_SEQ
you leave uncommented.
I then used memtrace
and memtrace-viewer
to take a look at memory usage :
- with lists, the program used 2.29GB memory (if I’m reading it properly), most of it being allocated by
rev_map
as expected - with seq, the program used 7.43GB memory ! Most of it being allocated by seq.map.
Here are screenshots in case I’m reading wrong
I’m very surprised as it seems to be the opposite of what this article from ocamlverse about iterator says :
The classic example is something like
let l1 = [1; 2; 3] |> List.map (fun x -> x + 1) |> List.map (fun x -> x * 3)
In this example, each stage of the computation needs to be materialized, causing allocation. Instead, we can do
let l1 = List.to_seq [1; 2; 3] |> Seq.map (fun x -> x + 1) |> Seq.map (fun x -> x * 3) |> List.of_seq
This version will not allocate the intermediate steps.
I’d be very interested in understanding what I did wrong there
Thanks !