I am using the Angstrom library to write a parser for a markdown language. How do I return an informative error message to the user about the location of a parsing error?
Edit: I want to clarify the problem a bit. A markdown format has a few kinds of special syntactic elements that must be formatted exactly right, because they give the document its tree structure. On the other hand, a markdown document is mostly just raw text.
There is a naive way of writing a parser here which is something like
type doc_element =
| Structured_element of t1 * t2 * t3
| Timestamp of date * time
| Heading of int * string
| Unstructured_raw_paragraph of string
Now I write one parser per element, each looking for strings of the correct form for that element, something like
let structured_elt_parser : doc_element Angstrom.t = (...);;
let heading_parser : doc_element Angstrom.t = (...);;
let timestamp_parser : doc_element Angstrom.t = (...);;
let unstructured_parser : doc_element Angstrom.t = (...);;
(* list elements are separated by ';' in OCaml; ',' would build a single tuple *)
let main_loop = Angstrom.choice [structured_elt_parser; heading_parser; timestamp_parser; unstructured_parser]
|> Angstrom.many;;
This design suffers from a flaw: a small syntax error in a structured element makes that element's parser fail, after which choice falls through to the unstructured parser, which always succeeds and classifies the text as a raw unstructured paragraph.
I guess I am looking for a form of nonlocal control flow where the failure of a low-level parser can cause a higher-level parser to fail on the grounds that the document is ill-formatted. I think just raising an exception would be good enough for my purposes, but I was wondering whether there is a more idiomatic way to do this inside the parsing library, without jumping out of the parser completely.
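One candidate, as far as I can tell from the Angstrom documentation, is the commit combinator: once a parser has committed past the position where an enclosing choice started, a later failure should no longer be retried by that choice (or by many). Below is a minimal sketch of that idea, not a definitive solution. The elt type, line_body, and the heading rule (one or more '#' followed by a single space) are simplified, hypothetical stand-ins for my real parsers, and it assumes a reasonably recent Angstrom that provides pos and the ~consume argument of parse_string.

```ocaml
(* Requires the angstrom package (e.g. `opam install angstrom`). *)
open Angstrom

(* Hypothetical, simplified element type used only for this sketch. *)
type elt =
  | Heading of int * string
  | Raw of string

(* A non-empty line, consuming its terminator (newline or end of input). *)
let line_body : string t =
  take_while1 (fun c -> c <> '\n') <* (end_of_line <|> end_of_input)

let heading : elt t =
  take_while1 (fun c -> c = '#') >>= fun hashes ->
  (* After the leading '#'s this line must be a heading. Committing here
     means a failure below is not retried by the enclosing choice/many. *)
  commit *> pos >>= fun offset ->
  (char ' ' *> line_body)
  <|> fail (Printf.sprintf
              "malformed heading at byte offset %d: expected ' ' after '#'"
              offset)
  >>| fun title -> Heading (String.length hashes, title)

(* The catch-all parser: any non-empty line is a raw paragraph. *)
let raw : elt t = line_body >>| fun s -> Raw s

let document : elt list t = many (choice [ heading; raw ])

let parse (s : string) : (elt list, string) result =
  parse_string ~consume:Consume.All document s
```

With this sketch, parse "# Title\nsome text\n" should return Ok with a Heading and a Raw element, while parse "#Title\nsome text\n" should return an Error carrying the offset message instead of silently classifying the line as Raw. The trade-off is that any raw line that happens to start with '#' becomes a hard error, so the commit point has to be chosen where the intent is unambiguous.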