I’m learning how to write a parser, and I have been following the calc.ml example from Chapter 13 of the manual, “Lexer and parser generators (ocamllex, ocamlyacc)”. The program compiles and runs as expected.
When the input contains a character that is not listed in the lexer rules (the semicolon in “2;” below), the program fails:
$ bash calc.sh
11 states, 267 transitions, table size 1134 bytes
2*2
4
2;
Fatal error: exception Failure("lexing: empty token")
What is the idiomatic way of handling this?
I want the program to print an error message that contains the invalid token and its position in the input, and to keep running rather than terminate.
Should the surrounding code simply handle the Failure exception?
Should the lexer include a catch-all rule for an “invalid token” type, and let the parser produce an error message instead?
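To make the question concrete, here is a rough sketch of a variant that combines the two options: a catch-all rule that raises a dedicated exception, which the surrounding code then handles. It is an ocamllex source file, not plain OCaml. The token type is declared inline only so the file stands alone (in the manual's example it comes from the generated Parser module), and the names `Lexing_error` and `tokens_of_string` are my own inventions, not part of the manual's example. I don't know whether this is the idiomatic approach.

```ocaml
(* lexer_sketch.mll: run through ocamllex, then compile the generated .ml *)
{
  (* Declared inline so the sketch is self-contained; the manual's lexer
     opens the ocamlyacc-generated Parser module for this type instead. *)
  type token = INT of int | PLUS | MINUS | TIMES | DIV | LPAREN | RPAREN | EOL

  exception Eof

  (* Hypothetical exception: the offending text and its character offset. *)
  exception Lexing_error of string * int
}

rule token = parse
  | [' ' '\t']        { token lexbuf }   (* skip blanks *)
  | '\n'              { EOL }
  | ['0'-'9']+ as lxm { INT (int_of_string lxm) }
  | '+'               { PLUS }
  | '-'               { MINUS }
  | '*'               { TIMES }
  | '/'               { DIV }
  | '('               { LPAREN }
  | ')'               { RPAREN }
  | eof               { raise Eof }
  (* Catch-all, placed last: on equal-length matches the earliest rule wins,
     so this only fires for characters no rule above accepts. *)
  | _ as c            { raise (Lexing_error (String.make 1 c,
                                             Lexing.lexeme_start lexbuf)) }

{
  (* Lex a whole string; report each invalid character and keep going.
     The bad character has already been consumed when the exception is
     raised, so calling token again resumes with the next character. *)
  let tokens_of_string s =
    let lexbuf = Lexing.from_string s in
    let rec loop acc =
      match token lexbuf with
      | tok -> loop (tok :: acc)
      | exception Eof -> List.rev acc
      | exception Lexing_error (text, pos) ->
          Printf.eprintf "invalid token %S at offset %d\n" text pos;
          loop acc
    in
    loop []
}
```

With this, `tokens_of_string "2;"` should return `[INT 2]` and report the semicolon at offset 1 instead of dying with `Failure`. The offset counts characters from the start of the lexbuf, so with input read from stdin it would not be a per-line column unless line positions are tracked separately.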
Can you recommend a source file from a real application or library that would be a good example to read and follow?