Seq vs List, optimization

giltho · November 30, 2020, 9:24pm

Hello!

TL;DR: my attempt at optimizing a program using Seq instead of List ended up using 3 to 4 times the memory, and I don’t understand why…

I’m currently trying to optimize some code that seems to eat a lot of memory for some reason.
I noticed that code has several List.map and List.filters done in a row on the same list. I thought I’d use Seq instead at that point, but I’m not used to really think about optimization so I thought I’d do some tests on my own.

I’m very surprised about the results.
Here’s is the test program :

let () = Random.self_init ()

module WITH_SEQ = struct
  let of_list = List.to_seq

  let ( >| ) o f = Seq.map f o

  let to_list = List.of_seq
end

module WITH_LIST = struct
  let of_list x = x

  let ( >| ) o f = List.rev_map f o

  let to_list x = x
end

open WITH_LIST

(* open WITH_SEQ *)

let higher_bound = 10000

let size = 1000000

let make size = of_list @@ List.init size (fun _ -> Random.int higher_bound)

let f1 a = a + Random.int 12134342

let f2 a = a * Random.int 23

let f3 a = a - Random.int 2343432

let do_ () =
  let l = make size >| f1 >| f2 >| f3 |> to_list in
  List.length l |> string_of_int |> print_endline

let main () =
  Memtrace.trace_if_requested ();
  for _ = 0 to 20 do
    do_ ()
  done

let () = main ()

The program can be simply executed choosing which of WITH_LIST or WITH_SEQ you leave uncommented.
I then used memtrace and memtrace-viewer to take a look at memory usage :

with lists, the program used 2.29GB memory (if I’m reading it properly), most of it being allocated by rev_map as expected
with seq, the program used 7.43GB memory ! Most of it being allocated by seq.map.
Here are screenshots in case I’m reading wrong

image3576×1558 487 KB

image3584×1630 468 KB

I’m very surprised as it seems to be the opposite of what this article from ocamlverse about iterator says :

The classic example is something like
let l1 = [1; 2; 3]
  |> List.map (fun x -> x + 1)
  |> List.map (fun x -> x * 3)
In this example, each stage of the computation needs to be materialized, causing allocation. Instead, we can do
let l1 = List.to_seq [1; 2; 3]
 |> Seq.map (fun x -> x + 1)
 |> Seq.map (fun x -> x * 3)
 |> List.of_seq
This version will not allocate the intermediate steps.

I’d be very interested in understanding what I did wrong there
Thanks !

giltho · November 30, 2020, 9:31pm

Followup question, the article says also says that Iter is more efficient than Seq, so I’m curious why Iter isn’t part of stdlib while seq is :

pveber · November 30, 2020, 9:35pm

Have you used a compiler switch with flambda activated?

giltho · November 30, 2020, 9:38pm

Sorry forgot to say.
The compiler is ocaml-base-compiler.4.11.1 (so no flamba)

pveber · November 30, 2020, 9:38pm

I suspect this is because Seq is more expressive, you may find more details in this article by @gasche.

pveber · November 30, 2020, 9:40pm

Then you should definitely have a try, it may make a big difference for that kind of code.

SkySkimmer · November 30, 2020, 9:45pm

Isn’t that the wrong order?

giltho · November 30, 2020, 9:54pm

@pveber flambda would definitely be an improvement, but Seq should allocate more than Lists in any case ?
@SkySkimmer, I think the order is correct, I’m using it like the pipe operator

SkySkimmer · November 30, 2020, 10:06pm

I mean shouldn’t it be non-rev? Seq.map preserves the order AFAIK
Of course that means you need to decide what tailrec List.map to use

giltho · November 30, 2020, 10:12pm

Ah sorry. Yes, the results of both tests are reversed. If you reduce size to something that List.map can work with like 200000, and use List.map instead of List.rev_map, results are the same : seq uses 1.48GB, list uses 450M

giltho · November 30, 2020, 10:16pm

Update about flambda :
Using the example with List.map instead of List.rev_map, size = 200000 and flambda the results are not very different :

1.36GB for seq (1.48GB without flambda)
435M for list (450M without flambda)

yawaramin · November 30, 2020, 10:27pm

Memory usage for a lazy data type like Seq.t shouldn’t be affected by flambda–I think what’s happening here is that you’re loading the entire seq into memory immediately because you’re building it from a list. And a fully-loaded seq actually takes more memory than its equivalent list.

EDIT: when you build the seq you need to use Seq.unfold, not List.init.

giltho · November 30, 2020, 10:33pm

Hmmm… I understand that Seq.unfold would be definitely more efficient than List.init. However, in both cases the list is allocated fully. Then I would expect Seq.map to not reallocate the intermediate steps.
I’m in the exact case described in the article… so if that is true I’m guessing the article is also wrong ?

dbuenzli · November 30, 2020, 10:36pm

I’m too lazy to work out the details but maybe if you do it won’t be so surprising. In lists you just have one cons cell per list element, in seq you get a cons cell + a closure, so you can expect more will be allocated by processing with Seq.

giltho · November 30, 2020, 10:41pm

ok, yeah, I can definitely see that being true…But then what’s the point of Seq ? There’s 3 times the allocation so it uses more memory and also probably takes more time (although on my example it seems to take more or less the same time)…

dbuenzli · November 30, 2020, 10:47pm

Don’t ask me I was against its inclusion (but for other reasons).

More seriously the point versus lists is that if your data is streamable, the whole program run may allocate more but at every given point in time your program may have a smaller memory footprint since not all the data has to be in memory.

giltho · November 30, 2020, 10:48pm

Ahhh, then that makes it clear
Thank you very much for your answers !

Drup · November 30, 2020, 11:46pm

Alternatively, you could avoid doing roundtrip to lists and write your code as you naturally would:

module WITH_SEQ = struct
  let init n f = OSeq.init ~n f (* taken from Oseq because I'm lazy, the code is trivial *)
  let ( >| ) o f = Seq.map f o
  let sum = OSeq.fold_left (+) 0
end

module WITH_LIST = struct
  let init n f = List.init n f
  let ( >| ) o f = List.rev_map f o
  let sum = List.fold_left (+) 0
end

And then, :magic: , a fabulous allocation rate that never get promoted because we never need to actually build the full lists, and a runtime divided by around 3 … tada it’s the number of map in your code

Trace file list.mem (5.6 kB) [....]
Trace statistics:
   Minor: 279 MB/s allocations, 98.3% promoted
   Major: 274 MB/s allocations (  0 B/s direct), 270 MB/s collections
   Sizes (by word): 100% <= 4w, 100% <= 16w, 100% <= 256w. Max:  16 B
   Runtime: 8.216 s

Trace file seq.mem ( 10 kB) [....]
Trace statistics:
   Minor: 2.3 GB/s allocations, 0.0% promoted
   Major:   0 B/s allocations (  0 B/s direct),   0 B/s collections
   Sizes (by word): 49% <= 4w, 100% <= 16w, 100% <= 256w. Max:  40 B
   Runtime: 2.279 s

As @dbuenzli pointed out, Seq's point is to stream. Doing back-and-forth between lists makes your code more convoluted, slower, and consume more memory.

Also, for fun, gen is an imperative enumerator, it allocates less, but it’s rather painful to manipulate, and doesn’t really go much faster in practice most of the time.

Trace file gen.mem (2.8 kB)
Trace statistics:
   Minor: 607 MB/s allocations, 0.0% promoted
   Major:   0 B/s allocations (  0 B/s direct),   0 B/s collections
   Sizes (by word): 100% <= 4w, 100% <= 16w, 100% <= 256w. Max:   8 B
   Runtime: 2.123 s

And iter:

Trace file iter.mem (242 B) [...]
Trace statistics:
   Minor: -nan TB/s allocations, -nan% promoted
   Major: -nan TB/s allocations (-nan TB/s direct), -nan TB/s collections
   Sizes (by word): -nan% <= 4w, -nan% <= 16w, -nan% <= 256w. Max:   0 B
   Runtime: 0.000 s

Iter is basically a for loop, it barely allocates, especially on such a trivial benchmark. The actual runtime is 2s here. As highlighted before, the downside of Iter is that you can’t write combinators like zip with it, making it much more limited.

In conclusion, don’t include data-structure roundtrips in your benchmarks measurements, and try to write producer and consumers that manipulates your streaming datastructure directly when possible.

Conclusion²: iter smokes everything else when you can use it.

(and sorry, about the emoji mood)

giltho · December 1, 2020, 12:08am

I’m sorry if I came off aggressive/ungrateful/… in my questions that was really not the point
My feeling was really “I don’t understand what is happening, it goes against what I read earlier”.

My benchmark was really what I wanted to do : List -> Seq -> List.
The reason is that my bit of code manipulates list and I can’t change it without a very big effort that I don’t have time to put it right now. My questions were about the precise use-case described by the ocamlverse article in question, which seem to be wrong.
I’ve submitted an issue on the ocamlverse repo and will probably write a PR to fix this article soon.

Drup · December 1, 2020, 12:20am

Right, but then you are trying to do something that is fundamentally not really efficient. Rewriting code using List to use Seq instead is generally quite simple (Use Nil/Cons appropriately and insert some thunks here and there) and could get you the after-mentioned benefits.

If you just want to fuse some filter/map together, even to/from lists, use iter and it’ll work fine.

Topic		Replies	Views
Recommended patterns and techniques for Seqs (especially when memoized) Learning lazy , seq , memoization	13	427	January 27, 2025
Module 'Stream' removed from 5.0 standard library? Ecosystem	62	3854	August 3, 2022
A new List.map that is both stack-safe and fast Community article , performance	34	12692	February 24, 2018
Sequences: two schools of thought? Learning	30	1363	March 9, 2024
Why don't we use lazy lists in OCaml? Learning	9	572	January 24, 2025

Seq vs List, optimization

Related topics