[ANN] pacomb 1.1

Dear camelers,

I am glad to announce the release 1.1 of pacomb.

defined with the Grammar module (or indirectly though a PPX extension) to
the combinators of the Combinator module. The library offers scanner less
parsing, but the Lex module provide a notion of terminals and blanks
that give a simple way to write grammars in two phases, as usual.

The main advantage of PaComb and similar solutions, contrary to ocamlyacc, is
that grammars (compiled or not) are first class values. This allows using
the full power of OCaml for manipulating grammars. For example, this is very
useful when working with syntax extension mechanisms.

Importantly, the performances of PaComb are very good: it is only two to
five times slower than grammars generated by ocamlyacc, which is a compiler.

Defining languages using the Grammar module directly is cumbersome. For that
reason, PaComb provides a BNF-like PPX syntax extension (enabled using the
-ppx pacomb.ppx compilation flag).

Pacomb also support: self extensible grammars, ambiguous grammars (with merge),
late rejection of rule via raising exception from action code, priority and others.

Documentation is here https://raffalli.eu/pacomb/
github: GitHub - craff/pacomb: A parsing library that compiles grammars to combinators using elimination of left recursion

and it is available via opam install pacomb

As teaser, the usual calculator example:

(* The three levels of priorities *)
type p = Atom | Prod | Sum

let%parser rec
     (* This includes each priority level in the next one *)
     expr p = Atom < Prod < Sum
            (* all other rule are selected by their priority level *)
            ; (p=Atom) (x::FLOAT)                        => x
            ; (p=Atom) '(' (e::expr Sum) ')'             => e
            ; (p=Prod) (x::expr Prod) '*' (y::expr Atom) => x*.y
            ; (p=Prod) (x::expr Prod) '/' (y::expr Atom) => x/.y
            ; (p=Sum ) (x::expr Sum ) '+' (y::expr Prod) => x+.y
            ; (p=Sum ) (x::expr Sum ) '-' (y::expr Prod) => x-.y


Documentation is here https://raffalli.eu/pacomb/
github: https://github.com/craff/pacomb

and it is available via opam install pacomb

1 Like

This looks pretty neat, especially the syntax afforded by the ppx. I wonder if you have any notes re: how it compares to angstrom?

Thanks for your interest.

Briefly: pacomb (short for french “Pas Comb” = “Not Comb”) compiles BNF to combinators but the combinators are
not intended to be used directly. The interface of the Comb module is probably not optimized for this.
With pacomb you compile BNF to combinators (at runtime).

The main idea of pacomb it to provide a lot of nice feature with a very good performance:

  • full context free grammar in O(N^3) using cache and merge combinators.
  • notion of “blank” to eliminate meaningless characters. The blank can be changed during parsing.
  • byte position for speed and full position with utf8 column by rescanning the file.
  • rejection of rule from the action code
  • dependent and extensible grammars (the grammar depends from the beginning of the input)
  • parametric grammar
  • I probably forgot some.

Pacomb is then well suited for languages that embed other languages, natural languages, but standard application works very well too.

The combinator of pacomb are rather unusual: they use continuation and a scheduler (the combinator does not call itself its continuation, it returns to the scheduler that will call the continuation later to ensure that all alternatives are parsed in parallel) hence pacomb parses the buffer from left to right. Pacomb also uses special combinators for the compilation of left recursive grammars. See Comb.ml file.