I use angstrom extensively, so I’ll definitely be taking a look at reparse. It’s a pity you weren’t aware of angstrom prior to starting work on it; of course, being familiar with prior work can often usefully inform one’s own.
I wonder if this library is something you wanted to build and share regardless, or if you had unsuccessfully looked for something like it previously? Despite good work in the community, discovering appropriate libraries can still be a bit of a challenge sometimes compared to fuller ecosystems.
The library grew organically, mostly from my preference for, and experience of, writing recursive descent parsers. Thus the name reparse.
My main motivation was: “I like writing recursive descent parsers; how can I make writing them productive and efficient?” The main pain point I initially wanted to address was abstracting and encapsulating the parser state operations (get the next char, check for EOF, create the lexer/parser buffer, and so on), so that if I wanted to implement a parser for one of the HTTP RFCs, I wouldn’t have to worry about parser state management every time I started work on a new parser.
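To make the idea concrete, here is a minimal sketch of what encapsulating those state operations could look like. Every name in it (`state`, `create`, `is_eof`, `peek`, `next`, `take_while`, `http_method`) is hypothetical and chosen for illustration; this is not reparse's API, only one way the "get next char / is it eof / create buffer" layer might be factored out.

```ocaml
(* Hypothetical sketch: none of these names are taken from reparse. *)
type state = { input : string; mutable pos : int }

exception Parse_error of int * string

(* Create the parser buffer over a whole string. *)
let create input = { input; pos = 0 }

(* Is the parser at the end of the input? *)
let is_eof st = st.pos >= String.length st.input

(* Look at the next char without consuming it. *)
let peek st = if is_eof st then None else Some st.input.[st.pos]

(* Consume and return the next char, failing at end of input. *)
let next st =
  match peek st with
  | None -> raise (Parse_error (st.pos, "unexpected end of input"))
  | Some c ->
      st.pos <- st.pos + 1;
      c

(* Consume the longest prefix whose chars satisfy [pred]. *)
let take_while pred st =
  let start = st.pos in
  while (match peek st with Some c -> pred c | None -> false) do
    st.pos <- st.pos + 1
  done;
  String.sub st.input start (st.pos - start)

(* A parser written only against the helpers above; it never touches
   [pos] or the buffer directly. Restricting to uppercase letters is a
   simplification for this example, not the full RFC grammar. *)
let http_method st = take_while (fun c -> c >= 'A' && c <= 'Z') st
```

With this in place, a new parser such as `http_method` only combines the helpers, which is the kind of reuse described above.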
It was doable and easy enough that I went ahead and did it. v0.0.1 (not released) was just that. I used it in a few of my parsers and it worked quite well. From then onward it started taking on a shape of its own.
As an aside, v1.0.0 used the result type underneath as its central data type. However, I became quite disillusioned with the development experience, so I removed it in v2.0.0 and used the venerable exception type instead.
Exceptions vs. results in a library isn’t a massive hurdle, though I personally tend to catch and wrap exceptions as close to their sources as possible (and use exceptions in my own code mostly only for non-local jumps/returns). I think that is mostly orthodoxy these days, though of course tastes vary.
Thank you for having written this down. I came to the same conclusion for similar reasons, plus I very much prefer it when the code is not splattered with error-passing code (esp. when dealing with several libraries that use different conventions, as you noted).
To me, OCaml syntax is lean and light with exceptions, but using monadic error handling lowers the signal-to-noise ratio of the code significantly.
If only we had a way to know statically which exceptions can escape any given function, that would be the best of both worlds!
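For readers who haven't compared the two styles side by side, here is a small illustrative sketch. The functions (`digit_exn`, `pair_exn`, `digit_res`, `pair_res`) and the `Bad_digit` exception are made up for this example; `Result.bind` is the standard library function (available since OCaml 4.08).

```ocaml
exception Bad_digit of char

(* Exception style: failure is raised, success is a plain int. *)
let digit_exn c =
  if c >= '0' && c <= '9' then Char.code c - Char.code '0'
  else raise (Bad_digit c)

(* The happy path reads like ordinary arithmetic. *)
let pair_exn a b = (10 * digit_exn a) + digit_exn b

(* Result style: failure is a value, so every caller must unwrap it. *)
let digit_res c =
  if c >= '0' && c <= '9' then Ok (Char.code c - Char.code '0')
  else Error (Bad_digit c)

(* The same computation, with each step threaded through [Result.bind]. *)
let pair_res a b =
  Result.bind (digit_res a) (fun x ->
      Result.bind (digit_res b) (fun y -> Ok ((10 * x) + y)))
```

Binding operators (`let*`) can trim some of the nesting in `pair_res`, but the error plumbing stays visible at each step, which is the signal-to-noise trade-off being discussed. In exchange, the result version shows the possible failure in its type, which the exception version does not.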
For what it’s worth, I remember making a similar choice to use the exn type for signaling errors when I was designing the Orsetto recursive descent parser library. Mostly for the same reasons as you, but also because it facilitated composition of error recovery.
For example, Orsetto makes this function available to parsers:
(** The error check scanner. Use [ck p] to create a new scanner that
    produces either [Ok v] if [p] produces [v] or [Error x] if scanning the
    input with [p] raises an exception [x]. *)
val ck: 'r t -> ('r, exn) result t
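To illustrate how a combinator with that signature can support composable error recovery, here is a sketch over a toy parser type. This is not Orsetto's implementation: the representation of `'r t`, the `state` record, the `Unexpected` exception, the `char` and `or_else` helpers, and the choice to rewind the input position on failure are all assumptions made for this example.

```ocaml
(* Toy parser type: NOT Orsetto's actual representation. *)
type state = { input : string; mutable pos : int }
type 'r t = state -> 'r

exception Unexpected of int

(* Match exactly the char [c], or raise with the failing position. *)
let char c : char t = fun st ->
  if st.pos < String.length st.input && st.input.[st.pos] = c then begin
    st.pos <- st.pos + 1;
    c
  end
  else raise (Unexpected st.pos)

(* Same signature as the [ck] quoted above. Rewinding the position on
   failure is an assumption of this sketch. *)
let ck (p : 'r t) : ('r, exn) result t = fun st ->
  let saved = st.pos in
  match p st with
  | v -> Ok v
  | exception x ->
      st.pos <- saved;
      Error x

(* Error recovery by composition: try [p], fall back to [q]. *)
let or_else (p : 'r t) (q : 'r t) : 'r t = fun st ->
  match ck p st with
  | Ok v -> v
  | Error _ -> q st
```

The point of the sketch is that parsers can raise freely on the inside, while a caller that wants to recover converts the exception to a value at exactly one place.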
p.s. I was aware of Angstrom when it was introduced. The origins of Orsetto preceded it. I’ve been supporting Orsetto for a very very long time. I am an old person.
I just finished updating all the copyright notices for 2020, and I’m somewhat knocked on my heels by the fact that some of these files are now old enough to be eligible for the military draft. And when I started messing around with OCaml, the compiler group had already end-of-lifed three previous major versions of the tool chain.
All this is to say: I’m taking this moment to reflect on and appreciate the maturity of the OCaml language ecosystem.
On a related note, see Roberto Di Cosmo’s experience with resurrecting OCaml code from 1998. Finding the sources was difficult, but the OCaml 1.07 sources compiled fine with OCaml 4.05, 20 years later.