When I run the command menhir entry_tokenizer/parser.mly,
I get these warnings:
...
Warning: production nonempty_list(STRING) -> STRING is never reduced.
Warning: in total, 1 production is never reduced.
...
The types of the following nonterminal symbols are unknown:
nonempty_list(STRING)
I do not understand what is wrong with this parser, since the rule “vl = nonempty_list(STRING) { vl }” looks unambiguous to me.
prog:
| v = list_tokens; EOF { Some v }
| EOF { None } ;
Is there an implicit empty case between : and |? I would expect | to separate productions, in which case you have three productions and not two, as you might assume. For example, I am writing my productions for ocamlyacc like this:
It is not a question of ambiguity. If your grammar is given by
prog:
| nonempty_list(STRING)
| EOF
then it accepts the language
EOF
STRING
STRING STRING
STRING STRING STRING
etc
After seeing a STRING, the automaton has to decide whether to request more input, in the hope of shifting another STRING onto the stack, or to reduce the top STRING via nonempty_list(STRING) -> STRING. However, Menhir (like OCamlYacc) always chooses to request more input in this case, so the reduction is never used (and parsing fails when the input runs out). See “end-of-stream conflicts” in the Menhir manual (section 6.4).
By adding an explicit EOF token:
prog:
| nonempty_list(STRING) EOF
| EOF
the automaton will reduce nonempty_list(STRING) -> STRING when the lookahead token is EOF, and then keep reducing all the way up to prog.
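
For reference, here is a minimal sketch of a complete parser.mly with the explicit EOF in place. The token declarations and the start type (string list option) are my assumptions, based on the Some v / None actions in the question, so adjust them to match your lexer:

```
(* Assumed token declarations; the original file's header is not shown. *)
%token <string> STRING
%token EOF

(* Assumed start type, inferred from the Some/None actions in the question. *)
%start <string list option> prog

%%

prog:
| v = list_tokens; EOF { Some v }   (* one or more STRINGs, then EOF *)
| EOF { None }                      (* empty input *)

list_tokens:
| vl = nonempty_list(STRING) { vl }
```

With EOF as the lookahead token, the reduction nonempty_list(STRING) -> STRING becomes reachable, so the "never reduced" warning should no longer apply to this grammar.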