[ANN] Parseff: parser combinator library for OCaml 5

oh… that’s one combinator that I wouldn’t discard adding!

EDIT: Nevermind, the usage is parsera \#or parserb

Yes, don’t disagree, but If there’s combinator style worth adding it can be always added as a separate package or even in user’s code

1 Like

Thanks, would you mind opening a PR? Happy to have a Angstron (optimised) version on the repo!

No, it’s parsera or parserb :slight_smile:

maybe ocamlformat issue? dunno.

That’s a cute suggestion, but I don’t want to add operators in parseff

Share it on https://x.com/davesnx/status/2036739000271007997
Let me expand here:

I’m exploring the idea of making parseff faster, autoresearch suggested to remove effects entirely from parseff and it makes fused operations equal, but primitive operations are much faster.

Kind of loses the purpose of the name tbh, but frankly, the comparison is:

46-75% faster without effects because each combinator call is now a direct function call + DLS.get (~5ns) instead of Effect.perform + handler dispatch + continuation resume (~50-100ns)

  1. This PR Bench angstrom: make fair and angstrom more comparable by reynir · Pull Request #6 · davesnx/parseff · GitHub changed a bit the parseff (fair) version and angstrom to be actually slower
  2. This commit Performance optimizations: +34% json_fused, +48% arith_generic throug… · davesnx/parseff@8d83a14 · GitHub creats an optimised version of anstrom and does micro optimisations on parseff (0.3.0)

So, parseff 0.4.0 will probably ship without effects and be faster than angstron in any case, but for fused operations it will be almost the same.

1 Like

It looks like the alternatives parser will be much longer without specialized effect-based backtracking combinator. Is there any chance to make effects and reimplementation of recursive decent live together?

Also, if we want seriously talk about JSON parsing performance, we need to compare with GitHub - simdjson/simdjson: Parsing gigabytes of JSON per second : used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks · GitHub too

The public interface hasn’t changed. What do you mean, exactly?

We aren’t seriously comparing json parsing perf, we are comparing similar libraries.

1 Like

Not really surprising but we should be able to do better than ~50-100ns for the effects version. Could you say which OCaml version you used and what hardware it was on?

Feel free to go deeper with this branch No effects version by davesnx · Pull Request #7 · davesnx/parseff · GitHub and running those benchmarks parseff/bench at 41cc5852b3cbc78e00710b4c6727211d476d7cee · davesnx/parseff · GitHub

OCaml 5.4.0

Hardware

  • CPU: 2x AMD EPYC 9755 128-Core Processor
  • Cores/threads: 256 physical cores, 512 logical CPUs total
  • RAM: 2.2 TiB total (771 GiB in use)
1 Like