Persistent and ephemeral Seq

I’m having trouble with the Seq module of the Stdlib. If I understand correctly, the (only) difference between f and g below

let f seq (e0, e1) =
  match seq () with
  | Seq.Nil -> e0
  | Seq.Cons (hd, tl) -> e1;;

let g seq (e0, e1) =
  match Seq.uncons seq with
  | None -> e0
  | Some (hd, tl) -> e1;;

is that when calling g seq, the variable seq is (or actually “may be” ?) modified, whereas it is not (it never is) when calling f seq. Is that correct ? If so, apart from very special cases, f should be preferred over g, I guess ?

Actually, it seems that the behavior of several functions depends on whether the argument sequence is persistent or ephemeral (OCaml library : Seq). But it looks like there is no way to know in advance whether a given sequence is persistent or ephemeral ? So, several functions have unspecified behavior ? Even, it says that querying ephemeral sequences “might have” a side effect, and not “do have this and that” side effects. For instance, is it the case that if one does Seq.uncons s, and s is ephemeral, then s is modified to become its tail ? (I tested it on s = Seq.ints 0, but maybe Seq.ints creates persistent sequences: it is not specified.)

There might be something I misunderstand, because the documentation recommends using sequences over dispensers because the latter are always ephemeral, whereas the former can be both. I would think that with dispensers, at least we do know their status, and it’s worse not to know whether something is ephemeral or persistent. I see that the function memoize (resp. once) creates a persistent (resp. ephemeral) sequence, so I should do let s' = Seq.memoize s or let s' = Seq.once s with any sequence I encounter, just to be sure ?

From the definition of Seq.uncons:

Both f and g should behave exactly the same w.r.t the sequence seq (f may have one fewer stack frame, but both f and g, when instantiated with 2 parameters - a seq and a tuple, will evaluate the seq, and execute any side effects it relies on).

Just to be clear, the s itself doesn’t really get modified at all by any of the functions, it is more that s may be relying on side-effects that mean that the results returned by s only make sense if it’s called exactly once.

It’s not guaranteed, and the behaviour will depend on the implementation of the sequence.

Here is an example of a few implementations of a list_to_seq conversion function, one of which is persistent, and the other is ephemeral, and the other is safe to query the head without side-effects:

let rec list_to_seq ls () = match ls with
  | [] -> Seq.Nil
  | h :: t -> Seq.Cons (h, list_to_seq t)

let list_to_seq_eph ls =
  let ls = ref ls in
  let rec loop () =
    match !ls with
    | [] -> Seq.Nil
    | h :: t -> ls := t; Seq.Cons (h, loop) in
  loop
         
let list_to_seq_eph_query_safe ls =
  let ls = ref ls in
  let rec loop () =
    match !ls with
    | [] -> Seq.Nil
    | h :: t ->
      let cont () =
        ls := t;
        loop () in
      Seq.Cons (h, cont) in
  loop
1 Like

Thanks. That was a misunderstanding on my part. The dichotomy persistent/ephemeral is distinct from the dichotomy mutable/immutable. Sequences are an immutable data structure. (Its documentation could state it more clearly. The word “consume” is also misleading, since actually the terms of the sequence are not consumed in the layman sense of the word. Sticking with “query” seems clearer.)

As soon as a structure contains evaluations, you cannot guarantee that someone will not construct such a structure with a “hidden” side effect at some point, e.g.,

let (s: int Seq.t) =
  fun () -> Seq.Cons (0, fun () -> (print_endline "Hello"; Nil));;

Your examples generalize this and are very helpful.

1 Like