Menhir: How to use inline to add locations?

I saw this trick in "Your favorite Menhir tricks and fanciness", which is very impressive (cc @EmileTrotignon):

%public %inline located(X):
x=X { Position.with_poss $startpos $endpos x }

I tried to make a demo to test how to use it, but there aren't many examples out there, except this one in the new syntax (cc @fpottier):

let located(x) ==
  ~ = x; { { loc = $loc; value = x } }

let atomic_expr :=
  | LPAREN; ~ = expr; RPAREN; <>
  | located(
    | ~ = INT; <ELiteral>
    | ~ = unary_op; ~ = atomic_expr; <EUnOp>
    )

Is there any way to make located work in the old syntax?

Here is my code:

syntax.ml

type prim2 = Plus | Minus | Times
type 'a located = { loc : Lexing.position * Lexing.position; value : 'a }

type lexpr = raw_expr located
and raw_expr = Num of int64 | EPrim2 of prim2 * lexpr * lexpr

lexer.mll

{
open Parser
}
let whitespace = [' ' '\t' '\n' '\r' ]
let digit = ['0'-'9']
let number = digit+

rule read =
  parse
    | whitespace { read lexbuf }
    | number as lxm { NUMBER (Int64.of_string lxm) }
    | '(' { LPAREN }
    | ')' { RPAREN }
    | '+' { PLUS }
    | eof { EOF }

parser.mly


%{
open Syntax
%}
%token <int64> NUMBER
%token LPAREN
%token RPAREN
%token PLUS
%token EOF

%start <lexpr> start

%%

start:
| e = expr EOF { e}

expr:
| n = NUMBER { Num n }
| LPAREN e=expr RPAREN {e}
| e1=expr PLUS e2=expr { EPrim2 (Plus, e1, e2)}


%inline located(X):
| x=X {{ loc = $loc; value = x }}

I mean, what's the appropriate way to add located to the expr rule?

I tried several approaches, such as:

expr:
| n = NUMBER { Num n }
| LPAREN e=expr RPAREN {e}
| e1=located(expr) PLUS e2=located(expr) { EPrim2 (Plus, e1, e2)}

But I got this error:

Error: This expression should not be a constructor, the expected type is Dune__exe.Syntax.lexpr

It seems to me that the error you have here comes from your OCaml code, and that the Menhir part is correct.
Reading more carefully: start is supposed to return a lexpr but actually returns a raw_expr, so you should also add located to it. If this is not correct, having the location of the error would be quite helpful.
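To make that concrete, here is a minimal sketch of the rules section in the old syntax. It assumes the declarations from your parser.mly stay as they are, and that expr should keep producing a raw_expr while located(expr) wraps it into a lexpr. I have not run it against your project, so treat it as a starting point rather than a verified fix:

```ocaml
start:
| e = located(expr) EOF { e }   (* located(expr) has type lexpr, matching %start <lexpr> *)

expr:                            (* expr itself produces a raw_expr *)
| n = NUMBER { Num n }
| LPAREN e = expr RPAREN { e }
| e1 = located(expr) PLUS e2 = located(expr) { EPrim2 (Plus, e1, e2) }

%inline located(X):
| x = X { { loc = $loc; value = x } }
```

The grammar is still ambiguous on PLUS, exactly as in your original version; a precedence declaration such as %left PLUS in the header would usually take care of that.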


There is a big example using this trick in the Menhir sources: benchmarks/parsers/houblix/parser.mly · master · POTTIER Francois / menhir · GitLab
This is a parser for an ML language whose syntax is closer to Reason than to OCaml.
