Stack overflow during evaluation (looping recursion?)

I wrote this code. as you can see it does not contain any recursion and 1000000 million is no where close to the max value of int (so as to cause an overflow) but weirdly this code is getting a stack overflow. why?

List.init 1000000 (fun x -> x + 1) |> List.map (fun x -> x |> Lwt.return) |> Lwt.all |> Lwt_main.run;;

Is List.map not recursive ? [checking, List.init is tailrec]

Is List.map not recursive

It seems its not tail recursive List.init 1000000 (fun x -> x + 1) |> List.map( fun x -> x * 2);;
meets the same fate. Stack overflow during evaluation (looping recursion?).

But this is terrible. how does one work with relatively large datasets in ocaml? in today’s day and age mapping over a million item list shouldn’t be a big deal.

  1. IIRC, there are versions of List.map in some of the add-on libraries that tail-recurse after some limit

  2. List.map is trivial to implement in a tail-recursive fashion: anybody who is doing anything real will have done so.

  3. if you’re actually working with large in-memory data-sets you won’t be using lists. [Per Jean-Christophe Filliatre’s wise arguments] You will be using vectors of one sort or another, and they don’t have this problem.

  4. the tail-recursive version of List.map is slower than the stack-y version, b/c the stack-y version performs less allocation. So it’s a trade-off, and for many (actually, most) applications, the stack-y version is better, b/c faster.

I’m sure there are more points, but these are the ones I remembered off the top of my head.

4 Likes

Which version of OCaml are you using? This should not happen with recent versions: OCaml - The “Tail Modulo Constructor” program transformation

Welcome to utop version 2.9.1 (using OCaml version 4.13.1)!

The implementation of map in list.ml in 5.0.0 does not have this attribute. One could of course copy out the code and add the attribute.

1 Like

@Chet_Murthy For OCaml 4, you forgot:

  1. You can increase the stack limit of your operating system.

This is (probably) not necessary with the default size of the OCaml stack in OCaml 5.0.0.
And 5.1.0 will have the tailrecursive-modulo-cons implementation.

But yes, storing a sizeable amount of small objects in a List is not ideal, a simple array is already a better choice.

1 Like

I would even add, if the goal is to transform that first datastructure, storing it, no matter how, is less than ideal. Something iter would be a better choice.

2 Likes

Isn’t this one of the uses for Seq ?

1 Like

Isn’t this one of the uses for Seq ?

But Lwt.all needs a list right? is there any other way to wait on multiple Lwt.t?

Oh wait, you actually have that many Lwt’s. I should have looked at the original post. Hmm, not sure, I’m not into async stuff much, but I’d rethink the whole top level tasking approach. How about, say, splitting the work into chunks of size > 1 and handing those to the Lwt’s ?

What kind of thing do you want to do with the promises?

Does something like this that demos just printing them help?

let seq_to n =
  let step i = if i = n then Lwt.return_none else Lwt.return_some (i, succ i) in
  Lwt_seq.unfold_lwt step 0

let print_nums () =
  Lwt_seq.iter_s (Lwt_io.printlf "%i") (seq_to 1000000)

let () = Lwt_main.run (print_nums ())

I suppose not because it’s still just computing the stream in sequence… sorry for the misdirect.

If at the end of the day, you need a big list, then yes, you’ll have to pay the cost for it. What Seq/Iter will give you is to avoid creating intermediary list you don’t need. (but I’ve tried that example, minus the Lwt.all() life is too short to wait that long, and with ocaml 5 you can map over a list of 1 million items and create the promises.)

As @nobrowser mentionned, a very large collection of short live promises doesn’t sound like a great idea. Chunking could be a solution, maybe we don’t need the full list all at once and something like Lwt_stream would help. It’s difficult to know without any context.

1 Like

Lwt.all is not a primitive function. You can write the array version yourself and it will avoid some roundtrips between arrays and lists because Lwt.all implementation uses an intermediary array.

1 Like

This thread seems related to Tail recursive, efficient ways to dispatch ~1M requests via Lwt which was never fully resolved. Very curious to see if rewriting your own Lwt.all will work (please post if it does).

1 Like