I’m currently trying to parse a very simple document format using the above tools (and probably more than one document at a time). However, all the examples I seem to find are for a calculator that directly evaluates expressions. I feel like all my posts on this forum involve the words “maybe I’m a complete idiot” but
Example:
So let’s simplify my problem even more: let’s say I want to parse a list of double-quoted strings separated by ','.
How would I define this grammar? Do I need ocamllex? Can I do this entirely inline (as the docs and a few nearly decade-old posts vaguely hint)? Can somebody give me a simple example?
More Questions:
What is the recommended way to use ocamllex and menhir (ocamlyacc) together (not as a calculator lol)?
With dune especially!
What am I missing?
And where can I get more guidance or examples to read through?
Relevant files are probably lexer.mll, parser.mly and dune.
Although more recently I’ve started using sedlex instead of ocamllex for my lexers, because sedlex allows me to use normal OCaml syntax (and hence Merlin and gopcaml-mode for editing).
For other examples of Menhir parsers, the OCaml parser is also written in Menhir and might be worth checking out as well.
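For the list of double-quoted strings from the question, the three files mentioned above could look roughly like the sketch below. It uses ocamllex rather than sedlex, and every name in it (the tokens `STRING`, `COMMA`, `EOF`, the start symbol `main`, the exception `Lexing_error`, the executable `main`) is an assumption made for illustration. The lexer does not handle escape sequences inside strings.

parser.mly:

```ocaml
/* A comma-separated list of strings, followed by end of input. */
%token <string> STRING
%token COMMA EOF

%start <string list> main
%%

main:
  | xs = separated_list(COMMA, STRING) EOF { xs }
```

lexer.mll:

```ocaml
{
  open Parser
  exception Lexing_error of string
}

rule token = parse
  | [' ' '\t' '\r' '\n']+      { token lexbuf }  (* skip whitespace *)
  | ','                        { COMMA }
  | '"' ([^ '"']* as s) '"'    { STRING s }      (* no escapes handled *)
  | eof                        { EOF }
  | _ as c                     { raise (Lexing_error (Printf.sprintf "unexpected character %C" c)) }
```

dune (this also needs a line such as `(using menhir 2.1)` in dune-project; the version number is only an example):

```
(ocamllex lexer)

(menhir
 (modules parser))

(executable
 (name main))
```

main.ml:

```ocaml
(* With ocamllex no glue code is needed: the generated parser takes
   the lexer function and a Lexing.lexbuf directly. *)
let parse_string (s : string) : string list =
  Parser.main Lexer.token (Lexing.from_string s)

let () =
  parse_string {|"foo", "bar", "baz"|}
  |> List.iter print_endline
```

Calling `parse_string` on several inputs in turn is enough to handle more than one document, since each call creates its own lexbuf.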
Yeah, good point - it does require a bit more boilerplate, although not too much nowadays.
This was all the extra code I needed to glue between the two interfaces for a recent project:
    exception Error

    let revised_parse lexbuf =
      let tok () =
        let tok = Lexer.token lexbuf in
        let (st, ed) = Sedlexing.lexing_positions lexbuf in
        (tok, st, ed) in
      MenhirLib.Convert.Simplified.traditional2revised
        Raw_parser.program
        tok

    let parse lexbuf =
      try
        revised_parse lexbuf
      with Raw_parser.Error -> raise Error

    let parse_string str =
      parse (Sedlexing.Utf8.from_string str)
This forces you to use the table backend though? That backend is quite a bit slower (which does not make a difference for most uses). I believe there should be a way to provide an “abstract” buffer to Menhir, that is, one function of type unit -> token and two of type unit -> position. But it’s not available yet.
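To make the shape of such an “abstract” buffer concrete, here is a minimal sketch in plain OCaml. The record type `abstract_buffer` and the helper `of_lexbuf` are hypothetical names made up for illustration; they are not part of Menhir or MenhirLib, and the sketch only shows how a standard `Lexing.lexbuf` could be wrapped in the three functions described above.

```ocaml
(* Hypothetical interface, NOT an existing Menhir API:
   one function producing tokens, two reporting positions. *)
type 'token abstract_buffer = {
  next_token : unit -> 'token;
  start_pos : unit -> Lexing.position;
  end_pos : unit -> Lexing.position;
}

(* Wrap an ocamllex-style lexer function and its lexbuf.
   The positions are read from the lexbuf after each token. *)
let of_lexbuf (lex : Lexing.lexbuf -> 'token) (lexbuf : Lexing.lexbuf)
  : 'token abstract_buffer =
  {
    next_token = (fun () -> lex lexbuf);
    start_pos = (fun () -> lexbuf.Lexing.lex_start_p);
    end_pos = (fun () -> lexbuf.Lexing.lex_curr_p);
  }
```

A sedlex-based lexer could be wrapped the same way by reading the two positions from `Sedlexing.lexing_positions` instead of from the lexbuf fields.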
nice-parser has an example of how to parse S-expressions. It’s a tiny library that encapsulates all the typical boilerplate code, so you can just focus on lexing and parsing.