Hello,
I’m using OCaml Yacc to parse but, while compiling, I get a warning
Warning: 9 states have shift/reduce conflicts.
Warning: 2 states have reduce/reduce conflicts.
Warning: 68 shift/reduce conflicts were arbitrarily resolved.
Warning: 4 reduce/reduce conflicts were arbitrarily resolved.
My parser looks like this
%{
open Ast
%}
%token <float> FLOAT
%token <int> INT
%token <string> LETTER
%token <string> BFUNC
%token POW MULT DIV PLUS MINUS
%token LPAREN RPAREN
%token EOF
%left PLUS MINUS
%left MULT DIV
%left PRODUCT
%right POW
%right BFUNC
%nonassoc SIGN
%start main
%type <Ast.math_ast> main
%%
main:
| expr EOF { $1 }
;
expr:
| INT { Int $1 }
| FLOAT { Float $1 }
| LETTER { Letter $1 }
| LPAREN expr RPAREN { $2 }
| BFUNC expr { FuncApp ($1, $2) }
| expr POW expr { ExpoApp ($1, $3) }
| expr MULT expr { Mult ($1, $3) }
| expr DIV expr { Mult ($1, ExpoApp ($3, Int (-1))) }
| expr PLUS expr { Add ($1, $3) }
| expr MINUS expr { Add ($1, Mult (Int (-1), $3)) }
| MINUS expr %prec SIGN { Mult (Int (-1), $2) }
| PLUS expr %prec SIGN { $2 }
| expr expr %prec PRODUCT { Mult ($1, $2) }
;
The parser.output file provided by the command ocamlyacc -v parser.mly
is a joke, it’s unreadable.
Your grammar has reduce-reduce conflicts stemming from the last rule expr -> expr expr. For example, expr PLUS expr can be reduced and produce either a “Mult” or an “Add”. These kinds of conflicts cannot be resolved by using precedences, and you need to adapt your grammar to avoid them.
In general ocamlyacc and menhir conflict mechanism is very much similar to that of Unix yacc or bison and it doesn’t involve anything specific to OCaml, so I would recommend reading up on any documentation on those tools to learn more.
If you use menhir, there is a flag --explain that outputs an explanation of any conflict found. Its not always easy to understand but it should be better than with ocamlyacc.
I think what @nojb is trying to say, is that there is a basic understanding of how to write LR(1) (in the case of Menhir and Yacc) grammars, that you probably should gain. Such an understanding is necessary to write a nontrivial grammar while avoiding parsing conflicts. In this case, probably any book on compilers should have a chapter on the subject: it is one of the most important parts of a typical undergraduate course in compilers.
I read the menhir manual and came up with something that seems to work. For anyone interested in, this code should parse mathematical expressions :
%{
open Ast
let neg ast = match ast with
| Int n -> Int (-n)
| Float x -> Float (-.x)
| _ -> Mult (Int (-1), ast)
%}
%token <float> FLOAT
%token <int> INT
%token <string> LETTER
%token <string> BFUNC
%token POW MULT DIV PLUS MINUS
%token LPAREN RPAREN
%token EOF
%left PLUS MINUS
%left MULT DIV
%nonassoc BFUNC
%right LETTER
%start main
%type <Ast.alg_ast> main
%%
/* an expression with implicit product property (ipp) to the right
can be concatenated with another expression with ipp to the left
to form an implicit product*/
main: /*output*/
| expr EOF { $1 }
;
unumber: /*expression that is a positive number. ipp to the right*/
| INT { Int $1 }
| FLOAT { Float $1 }
;
number: /*expression that is a real number*/
| unumber { $1 }
| PLUS unumber { $2 }
| MINUS unumber { neg $2 }
;
factor: /*expression that has ipp to the right and to the left*/
| LETTER { Letter $1 }
| LPAREN expr RPAREN { $2 }
| BFUNC number { FuncApp ($1, $2) }
| BFUNC number LETTER { FuncApp ($1, Mult ($2, Letter $3)) }
| BFUNC signed_factor { FuncApp ($1, $2) }
;
signed_factor:
| factor { $1 }
| PLUS factor { $2 }
| MINUS factor { neg $2 }
;
uexponent: /*unsigned expression that can be an exponent*/
| factor { $1 }
| unumber { $1 }
| factor POW exponent { ExpoApp ($1, $3) }
| unumber POW exponent { ExpoApp ($1, $3) }
;
exponent: /*signed expression that can be an exponent*/
| uexponent { $1 }
| PLUS uexponent { $2 }
| MINUS uexponent { neg $2 }
;
rfactor: /*expression that has ipp to the left*/
| factor { $1 }
| factor POW exponent { ExpoApp ($1, $3) }
;
factorS: /*expression containing multiple factors and end with an rfactor. ipp to the left*/
| rfactor { $1 }
| factor factorS { Mult ($1, $2) }
;
uexpr: /*unsigned expression that usually can't be implicitly multiplied*/
| unumber { $1 }
| factorS { $1 }
| unumber factorS { Mult ($1, $2) }
| unumber POW exponent { ExpoApp ($1, $3) }
| uexpr PLUS uexpr { Add ($1, $3) }
| uexpr MINUS uexpr { Add ($1, neg $3) }
| uexpr MULT uexpr { Mult ($1, $3) }
| uexpr DIV uexpr { Mult ($1, ExpoApp ($3, Int (-1))) }
;
expr: /*signed expression that usually can't be implicitly multiplied*/
| uexpr { $1 }
| PLUS uexpr { $2 }
| MINUS uexpr {
match $2 with
| Add (a, b) -> Add (neg a, b)
| Mult (a, b) -> Mult (neg a, b)
| _ -> neg $2
}
;