On a whim, today I started thinking about how I would write stream-parsers for Seq.t. That is, suppose that instead of camlp-streams’ Stream.t, we had Seq.t. And right off the bat, there’s a problem. Maybe somebody here who understands Seq.t better can tell me how to resolve these issues.
(1) So (ignoring one particular operation which isn’t used very often, and could be forbidden [1]): Stream.t is persistent for its first element: you can fetch it as many times as you want, and when you’re finished with it you can discard it and move on. But it’s not as if the entire sequence is persistent: if I have two variables that point at the same sequence, IIRC “persistent” means that each can consume the sequence independently, without seeing the other’s consumption, and that wouldn’t be the case with Stream.t. I don’t know how to get this depth-one behaviour from Seq.t.
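To make the property concrete, here is a minimal sketch (mine, not from the original post) of two ways to approximate it with the standard Seq module; `peek` and `persistent_copy` are names I’m inventing for illustration:

```ocaml
(* An ephemeral Seq.t must not be forced twice, but forcing it once and
   caching the resulting node gives a "read the head as often as you
   like" discipline similar to Stream.peek. *)
let peek (s : 'a Seq.t) : ('a * 'a Seq.t) option =
  match s () with                      (* force exactly once *)
  | Seq.Nil -> None
  | Seq.Cons (x, tl) -> Some (x, tl)   (* x and tl can be re-used freely *)

(* Seq.memoize goes further and makes the whole sequence persistent:
   the result may be forced any number of times, from any point. *)
let persistent_copy (s : 'a Seq.t) : 'a Seq.t = Seq.memoize s
```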
But maybe we should go with persistent sequences then?
(2) Which leads to the second problem: is there a way to construct sequences – a collection of construction operations – that guarantees that all sequences thus constructed are persistent? Because otherwise, it seems like you’d need to wrap sequences in Seq.memoize all the time, on the off-chance that what you’d been given wasn’t persistent.
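One way to read this question is as asking for a smart-constructor interface. Here is a hedged sketch under that assumption; the module name `Pseq` and its signature are mine, not an existing library:

```ocaml
(* "Persistent by construction": the only way an arbitrary (possibly
   effectful) thunk can enter is through [of_seq], which memoizes at
   the boundary; the other operations only combine sequences that are
   already persistent with pure code, which preserves persistence. *)
module Pseq : sig
  type 'a t = 'a Seq.t            (* persistent by invariant *)
  val empty  : 'a t
  val cons   : 'a -> 'a t -> 'a t
  val append : 'a t -> 'a t -> 'a t
  val of_seq : 'a Seq.t -> 'a t   (* sole entry point for unknown sequences *)
end = struct
  type 'a t = 'a Seq.t
  let empty  = Seq.empty
  let cons   = Seq.cons
  let append = Seq.append
  let of_seq s = Seq.memoize s
end
```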
I hope I’ve expressed these questions coherently, and if anybody has any answers I’d appreciate 'em.
[1] The one operation is “npeek”, which lets you peek at the first N values of the stream. It’s kind of like the take operation, except that it doesn’t remove those values from the stream; hence if you allow this operation, you’re stipulating arbitrary persistence.
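For comparison, here is a rough Seq-based equivalent of npeek, assuming the sequence is persistent (the code and that assumption are mine):

```ocaml
(* Look at the first n elements without consuming them.  This is only
   safe if [s] is persistent (e.g. produced by Seq.memoize); otherwise
   the caller could not legally force [s] again afterwards. *)
let npeek n (s : 'a Seq.t) : 'a list =
  List.of_seq (Seq.take n s)
```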
P.S. The reason Stream.t provides depth-one persistence is that it’s a useful abstraction for implementing LL(1) parsers. You can peek at the input stream and use that to decide which production to parse; that production will then consume tokens, but doesn’t have to assume that the first token has already been consumed. It’s a convenient way to structure things.
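Here is a small sketch of that LL(1) shape written directly against Seq.t; the token type and the two productions are placeholders I invented for illustration, and memoizing up front is just one way to make the peek repeatable:

```ocaml
type token = A of int | B of string
type ast = Num of int | Str of string

(* Each production consumes its own first token; it does not assume the
   dispatcher already removed it. *)
let parse_a toks =
  match toks () with
  | Seq.Cons (A n, rest) -> (Num n, rest)
  | _ -> failwith "expected A"

let parse_b toks =
  match toks () with
  | Seq.Cons (B s, rest) -> (Str s, rest)
  | _ -> failwith "expected B"

(* Peek at the first token to choose a production, then hand the same
   sequence (head still included) to that production. *)
let parse (toks : token Seq.t) =
  let toks = Seq.memoize toks in   (* make forcing [toks] repeatable *)
  match toks () with
  | Seq.Nil -> failwith "unexpected end of input"
  | Seq.Cons (A _, _) -> parse_a toks
  | Seq.Cons (B _, _) -> parse_b toks
```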
So there are two parts to streams: parsers, and construction. This is about construction.
Two examples:
[< 'e1 ; e2 >] : this is -about- the same as “Seq.cons”: e1 is of type 'a and e2 of type 'a Stream.t. But two things:
(a) e2 isn’t evaluated, as it would be with Seq.cons e1 e2.
(b) more importantly, how do we know that e2 is persistent? And if we don’t, then do we need to memoize this expression too? And when the resulting expression is combined with another stream, e.g. in:
[< e1 ; e2 >] – which is kind of like Seq.append – does -that- need to be wrapped in a memoize also? What is the efficiency of memoize (memoize s) as compared to memoize s?
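On point (a) above, here is a small sketch of the difference (`lazy_cons` is a name I made up): passing the tail as a thunk delays the tail expression itself, which Seq.cons does not.

```ocaml
(* Mirrors [< 'e1 ; e2 >]: the expression producing the tail sequence
   is not evaluated until the tail is actually demanded. *)
let lazy_cons (x : 'a) (tail : unit -> 'a Seq.t) : 'a Seq.t =
  fun () -> Seq.Cons (x, fun () -> tail () ())

(* With Seq.cons the tail expression is evaluated at construction time
   (though the sequence it denotes is still not forced). *)
let eager_example = Seq.cons 1 (Seq.return 2)
let lazy_example  = lazy_cons 1 (fun () -> Seq.return 2)
```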
ETA: In case it’s not obvious, my goal here is -not- to implement my own sequence-like type (I have one of those – Stream.t from camlp-streams) but rather to use Seq.t to implement stream-parsers and expressions in the style that Camlp5 has, and that OCaml used to have decades ago. That is to say, to allow writing stream-parsers (AKA lightweight, composable LL(1) parsers) on Seq.t as-is, not on some type that would be concocted on top of Seq.t, or some type that looks kind of like Seq.t.
What exactly is your concern with Seq.cons in this case? Note that Seq.t is just an alias for a unit-arg function which itself returns the inner node type. So as the implementor of a Seq.t, you get to choose exactly what is evaluated and when. More concretely, e2 in your example would be a function that returns the tail sequence when passed to a Cons value constructor. I’m not sure exactly what the implications would be of double-memoizing, but you can always create your own memoized list by wrapping the value in a Lazy.t.
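A minimal sketch of that Lazy.t idea, with names of my own choosing (`lnode`, `of_seq`), assuming the goal is a list whose elements are computed at most once and can be re-read freely:

```ocaml
(* A cons cell whose tail is a Lazy.t: forcing the same cell twice is
   harmless, so the structure is persistent by construction. *)
type 'a lnode = LNil | LCons of 'a * 'a lnode Lazy.t

(* Convert an arbitrary (possibly ephemeral) Seq.t into this memoized
   form; the underlying sequence is walked at most once. *)
let rec of_seq (s : 'a Seq.t) : 'a lnode Lazy.t =
  lazy (match s () with
        | Seq.Nil -> LNil
        | Seq.Cons (x, tl) -> LCons (x, of_seq tl))
```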
You may find this thread interesting. In particular, the pattern recommended by @silene. This appears to be a robust construction in my testing and it also allows you to use combinators efficiently on top of the underlying “manual” sequences as long as you’re doing O(1) work per combinator operation. In other words, you don’t need to repeatedly call Seq.memoize (and in fact likely don’t want to use that anywhere).