Debugging Angstrom (or parser combinators in general)

Hello.

I’m trying to debug a parser that I wrote, but I have no idea how to do it. Is there any resource that I can read upon on about how to debug a parser written with combinators?

My current issue is about (I think) a left recursive grammar, but not sure how to solve it. The grammar is simple lambda calculus with variables, lambdas, and applications. Currently I have

let alpha =
  take_while1 (fun c ->
    match c with
    | 'a' .. 'z' | 'A' .. 'Z' -> true
    | _ -> false)
;;
let parse_var = ws *> alpha >>| fun x -> IRVar x

let expr : ir_term t =
  fix (fun expr ->
    let lambda =
      lrstring "lambda" *> alpha
      >>= fun arg -> lrchar '.' *> expr >>| fun e -> IRAbs (arg, e)
    in
    let p1 = choice [ parens expr; lambda; parse_var ] in
    let appl = p1 >>= fun e1 -> ws *> p1 >>| fun e2 -> IRApp (e1, e2) in
    appl <|> p1)
  <* sc
  <* ws
;;

If you’re curious you can find definitions of helper functions in this post. The current code works fine with this input:

(lambda x. x) (lambda x. x x); 

Problem arises when I have more than 2 lambdas nested

lambda s. lambda n. lambda m. n m s; 

The result of this comes out as

(Ast.IRAbs ("s",                    
   (Ast.IRAbs ("n",
      (Ast.IRApp (
         (Ast.IRAbs ("m", (Ast.IRApp ((Ast.IRVar "n"), (Ast.IRVar "m"))))),
         (Ast.IRVar "s")))
      ))
   ))

Which is wrong… I have no idea why this happens, since it works fine with 2 lambdas nested.

One debugging strategy is to test it on smaller expressions to identify where it goes wrong. Have you tried parsing n m s? (without the surrounding lambdas)

Interesting, I actually haven’t tried it, and I just did and it gave an end_of_input error, which means parser wasn’t able to parse anything… The problem should be on the logic where I parse application then

I found the problem. The p1 parser in the code

let p1 = choice [ parens expr; lambda; parse_var ] in

that is used to parse individual parts of application cannot be an application itself, thus it doesn’t work with m n s.

If I include appl recursively, the parser runs infinitely. Is there a way to solve this?

I would recommend to replace recursion by iteration (many combinator).

I actually solved this using chainl1 combinator:

let chainl1 e op =
  let rec go acc =
    (lift2 (fun f x -> f acc x) op e >>= go) <|> return acc in
  e >>= fun init -> go init

Now I have

let atom = choice [ parens expr; lambda; parse_var ] in
chainl1 atom app

where

let app = ws *> return (fun x y -> IRApp (x, y))

I found the use of chainl1 in Monadic Parsing Combinator article (pg. 24), and implementation of chainl1 in README of Angstrom