Using Menhir’s $startpos and $endpos values I was able to tag my AST nodes with location information, then use them to report errors in my code.
To help with this, I created these helper functions and they work well,
let location (x: Lexing.position) =
let lnum = x.pos_lnum in
let cnum = x.pos_cnum - x.pos_bol in
let loc = Ast.Loc{lnum=lnum; cnum=cnum} in
loc
;;
let position (startpos: Lexing.position) (endpos: Lexing.position) =
let start = location startpos in
let end' = location endpos in
let pos = Ast.Pos{start=start; end'=end'} in
pos
;;
My issue is, to use this position function, I need to explicitly pass $startpos and $endpos everywhere in my parser grammar, while calling the position function.
I don’t mind doing this, since it’s stable, and works well for my task.
But it would be nice if there was some way to simplify the pos=(position $startpos $endpos) into something simple like pos=p where p would contain the value returned from the position function.
My intention is to reduce the amount of boiler plate code required to write my parser.
Another way that I would tackle this problem is by simply using Menhir’s $loc (a shorthand for ($startpos, $endpos)), and defining your AST similar to:
type position = Lexing.position * Lexing.position
type expr =
| Id of {value: string; pos: position}
| ... of { ... ; pos: position }
| ... of { ... ; pos: position }
id:
| i=IDVAL; { Ast.Id {value=i; pos=$loc} }
If you only need the position information for reporting errors, then you can save calling your position function for when you construct an error message. Another reason to save calling position for when you construct an error message is that you’ll need to do this anyway for errors raised by the parser itself.
More generally, beyond locations and error messages, I think it’s easiest when your parser mostly just constructs data with minimal extra logic, and any enhancements to it are done by the compiler’s later steps.
let location (x: Lexing.position) =
let lnum = x.pos_lnum in
let cnum = x.pos_cnum - x.pos_bol in
let loc = Ast.Loc{lnum=lnum; cnum=cnum} in
loc
;;
let position (startpos: Lexing.position) (endpos: Lexing.position) =
let start = location startpos in
let end' = location endpos in
let pos = Ast.Pos{start=start; end'=end'} in
pos
;;
let pos (loc: Lexing.position * Lexing.position) =
let startpos, endpos = loc in
position startpos endpos
;;
id:
| i=IDVAL; { Ast.Id {value=i; pos=(pos $loc)} }