Why a handwritten parser?

This is a really fraught area. I myself am a confirmed fan of LL(1) grammars, because they’re extensible, they correspond to my intuition about languages, and during language development, error detection is a local phenomenon. But on the other hand:

  1. LL(1) isn’t sufficient to cleanly express a grammar for the modern OCaml language: there are quite a few spots where one has to use more than one token of lookahead. Manifestly LALR(1) -is- enough to express a grammar for that language.
  2. A lot of the grief LALR(1) gets is (I suspect) due to the fact that the grammar-compilation algorithm is global, so its errors are … impenetrable. You get a conflict, and the generator doesn’t really tell you where or why it happened. To figure it out, you have to re-run the LR algorithm yourself and see where things went wrong. But this is remediable: modern versions of Menhir, I have been told, do a much, much better job of leading you to the actual rules that conflict, and of giving feedback on why the conflict arose.
  3. Predictive (LL(1)) parsers give “better” errors when confronted with bad input, but this, again, is a place where systems like Menhir have made great progress. I don’t know whether bison has made such progress, and maybe that’s why people rewrote the C++ parser from bison to recursive descent. If that’s the case, it’s a real pity.
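
To make the lookahead point in (1) concrete, here is a minimal sketch in Python of a predictive parser for a toy grammar. The grammar, the token format (a list of strings), and the name `parse_stmt` are all made up for illustration; none of this is OCaml’s actual grammar. As written, two of the alternatives start with the same IDENT token, so one token of lookahead can’t choose between them: you either left-factor the grammar or, as here, peek at a second token. The error messages also show the “local” flavour of predictive parsing: each one names the position and what was expected there.

```python
# Toy grammar (hypothetical, for illustration only):
#   stmt ::= IDENT "=" NUM        assignment
#          | IDENT "(" NUM ")"    call
#          | NUM                  literal

def parse_stmt(tokens):
    """Predictively parse one statement from a list of token strings."""

    def peek(k):
        # Token at offset k, or None past the end of input.
        return tokens[k] if k < len(tokens) else None

    def expect(k, want):
        # `want` is "NUM", "END", or a literal token such as ")".
        tok = peek(k)
        if want == "END":
            ok = tok is None
        elif want == "NUM":
            ok = tok is not None and tok.isdigit()
        else:
            ok = tok == want
        if not ok:
            raise SyntaxError(f"position {k}: expected {want}, got {tok!r}")
        return tok

    first = peek(0)
    if first is not None and first.isdigit():
        # One token of lookahead is enough to pick the literal rule.
        expect(1, "END")
        return ("lit", int(first))
    if first is not None and first.isidentifier():
        # Both remaining rules start with IDENT, so look at a second token.
        second = peek(1)
        if second == "=":
            num = expect(2, "NUM")
            expect(3, "END")
            return ("assign", first, int(num))
        if second == "(":
            num = expect(2, "NUM")
            expect(3, ")")
            expect(4, "END")
            return ("call", first, int(num))
        raise SyntaxError(f"position 1: expected '=' or '(', got {second!r}")
    raise SyntaxError(f"position 0: expected IDENT or NUM, got {first!r}")
```

For example, `parse_stmt(["x", "=", "42"])` takes the assignment branch and `parse_stmt(["f", "(", "7", ")"])` takes the call branch, while `parse_stmt(["x", "+", "1"])` fails at position 1, right where the input stops matching any rule.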

I’m a confirmed fan of LL(1) [and, someday, LL(k)]. But even I can see that many of the knocks that LALR(1) parser-generators get are based on older implementations.
