When to use list, and when to use Seq.t?

Continuing the discussion from Seq vs List, optimization:

The above discussion is some years old and discusses a specific case. I wonder about list vs Seq.t on a more general level (and not sure if that old thread still reflects current OCaml with respect to performance).

When writing functions that process an ordered β€œlist” of elements, I sometimes feel unsure whether I should use list or Seq.t (see Seq in the manual).

I understand that a list will reside completely in memory while a sequence may be lazy. So when I don’t want to keep all elements in memory (or if they haven’t even been generated by the time I start processing the first element), I will have to use Seq.t or some similar structure or mechanism.

Is it correct that OCaml distinguishes between list and Seq.t because it is not lazy by default? In Haskell, for example, both cases are covered by the same data type, right? In OCaml I have to make that decision myself, right?

I experimented a bit with speed by creating a list/sequence of 10,000 elements and consuming them. I have looked at three cases:

  • A list where I consume the elements in the same order as they have been created
    • by reversing it explicitly using List.rev after adding all the elements,
    • by using non-tail-call recursion (accepting stack growth),
    • by using the Tail Modulo Constructor transformation.
  • A list where process the elements in reverse order of their creation (i.e. not reversing it).
  • A sequence.

This is the code I ran:

let consume_list : int list -> int =
  let rec aux acc = function
    | [] -> acc
    | x :: xs -> aux (acc + x) xs
  in
  aux 0

let consume_seq : int Seq.t -> int =
  let rec aux acc seq = match Seq.uncons (seq) with
    | None -> acc
    | Some (x, xs) -> aux (acc + x) xs
  in
  aux 0

let provide_list_rev : unit -> int list =
  let rec aux acc = function
    | i when i < 10000 -> aux (i :: acc) (i + 1)
    | _ -> List.rev acc
  in
  fun () -> aux [] 0

let provide_list_no_tail_call : unit -> int list =
  let rec aux = function
    | i when i < 10000 -> i :: aux (i + 1)
    | _ -> []
  in
  fun () -> aux 0

let provide_list_tail_mod_cons : unit -> int list =
  let[@tail_mod_cons] rec aux = function
    | i when i < 10000 -> i :: aux (i + 1)
    | _ -> []
  in
  fun () -> aux 0

let provide_list_lifo : unit -> int list =
  let rec aux acc = function
    | i when i < 10000 -> aux (i :: acc) (i + 1)
    | _ -> acc
  in
  fun () -> aux [] 0

let provide_seq : unit -> int Seq.t =
  let rec aux i () = match i with
    | i when i < 10000 -> Seq.Cons (i, aux (i + 1))
    | _ -> Seq.Nil
  in
  fun () -> aux 0

let () =
  let args = ref [] in
  Arg.parse [] (fun arg -> args := arg::!args) "";
  let func = match !args with
  | ["1"] -> fun () -> ignore @@ consume_list @@ provide_list_rev ()
  | ["2"] -> fun () -> ignore @@ consume_list @@ provide_list_no_tail_call ()
  | ["3"] -> fun () -> ignore @@ consume_list @@ provide_list_tail_mod_cons ()
  | ["4"] -> fun () -> ignore @@ consume_list @@ provide_list_lifo ()
  | ["5"] -> fun () -> ignore @@ consume_seq @@ provide_seq ()
  | _ -> failwith "Unexpected command line argument"
  in
  for i = 1 to 100000 do
    func ()
  done

Guess what is fastest on my machine, using ocamlopt -O2 and taking the median of three runs each? (click below to uncover)

  • List and consume in same order as creation
    • Case #1 using List.rev: 7.59 seconds
    • Case #2 using non-tail-call recursion: 5.53 seconds
    • Case #3 using tail-modulo-cons: 6.75 seconds
  • Case #4 using list in last-in-first-out order: 3.28 seconds
  • Case #5 using a sequence: 4.45 seconds

Of course, these results may be non-representative in real-world scenarios and may also depend on data-size etc. And I didn’t use a sophisticated benchmarking framework, so don’t take these numbers too seriously.

But interestingly I see no clear performance advantage of list over Seq.t, unless the order of processing the created elements doesn’t matter or I can create or process the list in reverse order, such that List.rev or a similar mechanism becomes unnecessary.

Does that mean I should use Seq.t more often in cases where I don’t need a persistent list?

I have also been wondering about this:

Are there some downsides of Seq that I should know about, @dbuenzli? My rough benchmark above makes me feel like Seq.t is a good alternative to list in many cases (even though the syntax is sometimes a bit more confusing, using Seq.Cons and Seq.uncons).

The downside is that most programmers eventually end up using it wrongly (e.g. lazy IO) or when it’s not necessary and thus creating an unordinate amount of garbage for the GC to collect (as mentioned in the discussion you linked to).

1 Like

And I didn’t use a sophisticated benchmarking framework, so don’t take these numbers too seriously.

Taking your existing code and only replacing the last bit with

let () =
  let open Core in
  let open Core_bench in
  let b f g () =
    for i = 1 to 100_000 do
      ignore (f (g ()))
    done
  in
  Command_unix.run
    (Bench.make_command
       [
         Bench.Test.create ~name:"1" (b consume_list provide_list_rev);
         Bench.Test.create ~name:"2" (b consume_list provide_list_no_tail_call);
         Bench.Test.create ~name:"3" (b consume_list provide_list_tail_mod_cons);
         Bench.Test.create ~name:"4" (b consume_list provide_list_lifo);
         Bench.Test.create ~name:"5" (b consume_seq provide_seq);
       ])

which depends on core and core_bench, you can get a result like the following:

$ dune exec bin/main.exe -- time cycles alloc gc percentage speedup samples
Estimated testing time 50s (5 benchmarks x 10s). Change using '-quota'.
β”Œβ”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Name β”‚ Runs @ Samples β”‚ Time/Run β”‚ Cycls/Run β”‚ mWd/Run β”‚        mjWd/Run β”‚        Prom/Run β”‚ mGC/Run β”‚        mjGC/Run β”‚ Percentage β”‚ Speedup β”‚
β”œβ”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ 1    β”‚          3 @ 3 β”‚    5.54s β”‚   18.24Gc β”‚  6.00Gw β”‚ 538_317_592.50w β”‚ 538_317_592.50w β”‚  23.98k β”‚ 2_180_400.00e-3 β”‚    100.00% β”‚    2.20 β”‚
β”‚ 2    β”‚          3 @ 3 β”‚    3.75s β”‚   12.35Gc β”‚  3.00Gw β”‚ 181_005_775.50w β”‚ 181_005_775.50w β”‚  11.83k β”‚   764_800.00e-3 β”‚     67.71% β”‚    1.49 β”‚
β”‚ 3    β”‚          3 @ 3 β”‚    4.40s β”‚   14.49Gc β”‚  3.00Gw β”‚ 366_637_137.00w β”‚ 366_637_137.00w β”‚  12.22k β”‚ 1_557_200.00e-3 β”‚     79.42% β”‚    1.75 β”‚
β”‚ 4    β”‚          4 @ 4 β”‚    2.52s β”‚    8.30Gc β”‚  3.00Gw β”‚ 181_187_071.60w β”‚ 181_187_071.60w β”‚  11.83k β”‚   764_071.43e-3 β”‚     45.49% β”‚    1.00 β”‚
β”‚ 5    β”‚          3 @ 3 β”‚    4.02s β”‚   13.25Gc β”‚ 12.00Gw β”‚          18.00w β”‚          18.00w β”‚  45.78k β”‚       200.00e-3 β”‚     72.64% β”‚    1.60 β”‚
β””β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

1 Like

I wondered when this happens because in my previous example, Seq.t seemed competitive. But here is an example (I assume), where this garbage issue causes a slower runtime in the end:

let consume_list : int list -> int =
  let rec aux acc = function
    | [] -> acc
    | x :: xs -> aux (acc + x) xs
  in
  aux 0

let consume_seq : int Seq.t -> int =
  let rec aux acc seq = match Seq.uncons (seq) with
    | None -> acc
    | Some (x, xs) -> aux (acc + x) xs
  in
  aux 0

let () =
  let make_list size =
    let rec aux acc = function
      | n when n < size -> aux (n :: acc) (n + 1)
      | _ -> List.rev acc
    in
    aux [] 0
  in
  let list = make_list 10000 in
  for i = 1 to 100000 do
    ignore @@ consume_list list
  done;

with an execution time of 0.95 seconds on my machine.

Compare with:

-  let list = make_list 10000 in
+  let seq = List.to_seq (make_list 10000) in
   for i = 1 to 100000 do
-    ignore @@ consume_list list
+    ignore @@ consume_seq seq
   done;

resulting in 4.48 seconds on my machine, hence Seq.t being 4.7Γ— slower. :scream:

Which package do I need for that? core? When I try to install core through opam, I get:

$ opam install core
[ERROR] Package conflict!
  * No agreement on the version of ocaml-base-compiler:
    - (invariant) β†’ ocaml-base-compiler = 5.4.0
    - core β†’ ocaml < 5.1~ β†’ ocaml-base-compiler < 5.0.1~
    You can temporarily relax the switch invariant with `--update-invariant'
…

Which package do I need for that? core?

yes, and core_bench. It works for me on x86_64 linux and OCaml 5.4.1, but core and especially core_unix have some portability issues.