Resolving shift/reduce conflict in Menhir grammar

Hey everyone,
I’m trying to write a Menhir grammar to parse simple expressions in a C-like language, but I’m having some trouble resolving one particular shift/reduce conflict.

I want to be able to parse ‘function calls’:

expression ( expression )

as well as unary operators:

*expression

However, I want the unary operator rule to have a higher precedence than the ‘function call’ rule, so
*expression(expression) is equivalent to (*expression)(expression).

Here’s a simplified version of my grammar:

%token OPERATOR IDENTIFIER EOF LPAREN RPAREN

%start <unit> term

%%

term: expr EOF { () }
;

expr:
  IDENTIFIER { () }
| LPAREN expr RPAREN { () }
| OPERATOR expr { () }
| expr LPAREN expr RPAREN { () }
;

This produces one shift/reduce conflict, and since the default action is to shift, *expression(expression) is parsed as *(expression(expression)).

I’ve tried adding %prec declarations:

%nonassoc low
%nonassoc high

...

| OPERATOR expr %prec high { () }
| expr LPAREN expr RPAREN %prec low { () }

But Menhir says that the %prec declarations are never useful.

How would I force Menhir to reduce OPERATOR expr instead of shifting?

1 Like

I resolved this by splitting up the expr rule into two:

op_expr:
  IDENTIFIER         { () }
| LPAREN expr RPAREN { () }
| OPERATOR op_expr   { () }
;

expr:
  op_expr                 { () }
| expr LPAREN expr RPAREN { () }
;

I might be wrong, but I think what menhir does for shift/reduce conflicts is compare the precedence of a production to that of a token, as opposed to comparing two productions which is what it does for reduce/reduce conflicts. So you should have something like this where an actual token is referred to:

%nonassoc LPAREN
%nonassoc high

...

| OPERATOR expr %prec high { () }

I didn’t test that, though.

Of course if you managed to fix this in the grammar it’s probably more future-proof, precedence levels are such a pain to get right.