Hi all!
It is my pleasure to announce the first public release of `streaming`, a library for building efficient, incremental data processing pipelines that compose and don’t leak resources.
I built `streaming` after many experiments with different streaming and iteration models for OCaml. Several packages on OPAM share some of its goals (we even have `Stdlib.Seq` now!), but none of them combines (1) excellent performance, (2) safe resource handling and (3) a pure functional style for combinators. `streaming` solves these problems by implementing three basic and independent models: sources, sinks and flows. They represent the parts of a pipeline that produce, consume and transform elements, respectively. These models can be defined and composed independently to produce reusable “streaming blocks”.
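
To make the three models more concrete, here is a minimal sketch using only the standard library. The types and the helper names (`range`, `sum`, `map`, `take`, `>>`, `run`) are simplified and hypothetical, written for illustration; they are not meant as the library's actual definitions or API.

```ocaml
(* Hypothetical, simplified model of sources, sinks and flows.
   Illustration only; not the definitions used by the library. *)

(* A source produces elements on demand from a private state ['s]. *)
type 'a source =
  | Source : {
      init : unit -> 's;
      pull : 's -> ('a * 's) option;
      stop : 's -> unit;
    } -> 'a source

(* A sink folds elements into an accumulator and can report that it is full. *)
type ('a, 'r) sink =
  | Sink : {
      init : unit -> 'acc;
      push : 'acc -> 'a -> 'acc;
      full : 'acc -> bool;
      stop : 'acc -> 'r;
    } -> ('a, 'r) sink

(* A flow turns a sink of ['b] into a sink of ['a]. *)
type ('a, 'b) flow = { flow : 'r. ('b, 'r) sink -> ('a, 'r) sink }

(* Source of the integers lo, lo + 1, ..., hi - 1. *)
let range lo hi =
  Source {
    init = (fun () -> lo);
    pull = (fun i -> if i < hi then Some (i, i + 1) else None);
    stop = (fun _ -> ());
  }

(* Sink that adds up all the integers it receives. *)
let sum =
  Sink {
    init = (fun () -> 0);
    push = (fun acc x -> acc + x);
    full = (fun _ -> false);
    stop = (fun acc -> acc);
  }

(* Flow that applies [f] to every element before it reaches the sink. *)
let map f =
  { flow = (fun (Sink { init; push; full; stop }) ->
      Sink { init; push = (fun acc x -> push acc (f x)); full; stop }) }

(* Flow that lets at most [n] elements through, then reports "full". *)
let take n =
  { flow = (fun (Sink { init; push; full; stop }) ->
      Sink {
        init = (fun () -> (init (), 0));
        push = (fun (acc, i) x -> (push acc x, i + 1));
        full = (fun (acc, i) -> i >= n || full acc);
        stop = (fun (acc, _) -> stop acc);
      }) }

(* Flow composition: elements go through [f] first, then through [g]. *)
let ( >> ) f g = { flow = (fun sink -> f.flow (g.flow sink)) }

(* Connect a source to a sink through a flow. Nothing is initialised until
   [run] is called, and the source is stopped once the sink is full or the
   source is exhausted. *)
let run (source : 'a source) (f : ('a, 'b) flow) (sink : ('b, 'r) sink) : 'r =
  match source with
  | Source { init = src_init; pull; stop = src_stop } ->
    (match f.flow sink with
     | Sink { init; push; full; stop } ->
       let rec loop s acc =
         if full acc then (s, acc)
         else
           match pull s with
           | None -> (s, acc)
           | Some (x, s') -> loop s' (push acc x)
       in
       let s0 = src_init () in
       let acc0 = init () in
       let s, acc = loop s0 acc0 in
       src_stop s;
       stop acc)

(* Sum of the squares of the first three integers: 1 + 4 + 9. *)
let result = run (range 1 100) (map (fun x -> x * x) >> take 3) sum
let () = Printf.printf "%d\n" result (* prints 14 *)
```

In this sketch, the three pieces are defined separately and only meet in `run`, so `map (fun x -> x * x) >> take 3` is a reusable block that works with any compatible source and sink.
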
The library defines a central `Stream` model that relies on sources, sinks and flows. This model is a push-based iterator with performance characteristics similar to the `iter` iterator, which has type `('a -> unit) -> unit` and is known for being very efficient. But unlike `iter`, it has a pure functional core (no need to use mutable state and exceptions for flow control!) and can handle resource allocation and cleanup in a lazy and deterministic way. All of this comes with slightly better performance for common stream operations.
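
For readers unfamiliar with the `iter` model, the following sketch shows why it typically needs mutable state and exceptions. It is a hand-written illustration using only the standard library; the function names are chosen for this example and are not taken from any particular package.

```ocaml
(* The push-based "iter" model: an iterator is a function that feeds
   every element to a callback. *)
type 'a iter = ('a -> unit) -> unit

let of_list (xs : 'a list) : 'a iter = fun k -> List.iter k xs

(* Transformations are cheap: they only wrap the callback. *)
let map (f : 'a -> 'b) (it : 'a iter) : 'b iter = fun k -> it (fun x -> k (f x))

(* Folding requires mutable state, because the callback returns unit. *)
let fold (f : 'r -> 'a -> 'r) (init : 'r) (it : 'a iter) : 'r =
  let acc = ref init in
  it (fun x -> acc := f !acc x);
  !acc

(* Early termination requires an exception, because the producer
   has no other way to learn that the consumer is done. *)
let take (n : int) (it : 'a iter) : 'a iter = fun k ->
  let exception Stop in
  let count = ref 0 in
  if n > 0 then begin
    try
      it (fun x ->
        k x;
        incr count;
        if !count >= n then raise Stop)
    with Stop -> ()
  end

(* Sum of the squares of the first three elements: 1 + 4 + 9. *)
let total = fold ( + ) 0 (take 3 (map (fun x -> x * x) (of_list [1; 2; 3; 4; 5])))
let () = Printf.printf "%d\n" total (* prints 14 *)
```
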
For those who are curious about the performance characteristics of `streaming` and other models, I created a dedicated repository for stream benchmarks: https://github.com/rizo/streams-bench. In particular, it includes a few simple benchmarks for `Gen`, `Base.Sequence`, `Stdlib.Seq`, `Iter`, `Streaming.Stream` and `Streaming.Source`.
The library should soon be published on opam. In the meantime, I invite you to read the docs and explore the code:
- Library documentation: https://odis-labs.github.io/streaming
- GitHub project: https://github.com/odis-labs/streaming
Questions, opinions and suggestions are welcome!
Happy streaming!