Can anyone reproduce this in menhir?

If I want to parse parameters, something like this:
(Tralalero, Tralala, Crocodilo, Bombardilo)

I’d make this menhir rule.

params:
 LPAREN separated_list(COMMA, identifier) RPAREN {}

This works!
However, for whatever reason, I want to require a trailing comma, like this:
(Tralalero, Tralala, Crocodilo, Bombardilo,)

The natural thing is to write something like

params:
 LPAREN separated_list(COMMA, identifier) COMMA RPAREN {}

However, this doesn’t work, for whatever reason.

So I attempted something like

parameters:
LPAREN params RPAREN {}

params:
identifier COMMA params
identifier COMMA

Yet this did not work either. Does anybody have an idea on how to do this?

I found this to work:

params:
  LPAREN nonempty_list(terminated(identifier, COMMA)) RPAREN {}

Thanks. I love you :beating_heart:

What if I wanted to make the trailing comma optional? I attempted something like this:

params:
  LPAREN nonempty_list(terminated(identifier, COMMA)) option(COMMA) RPAREN {}

but that gave me a shift/reduce conflict:

** In state 44, looking ahead at COMMA, reducing production
** separated_nonempty_list(COMMA,use_tree) -> use_tree
** is permitted because of the following sub-derivation:

simple_path_special LBRACE separated_nonempty_list(COMMA,use_tree) option(COMMA) RBRACE // lookahead token appears because option(COMMA) can begin with COMMA
                           use_tree . 

** In state 44, looking ahead at COMMA, shifting is permitted
** because of the following sub-derivation:

simple_path_special LBRACE separated_nonempty_list(COMMA,use_tree) option(COMMA) RBRACE 
                           use_tree . COMMA separated_nonempty_list(COMMA,use_tree)

(Here use_tree plays the same role as identifier, i.e. a parameter, in the example above.)

The problem is that the parser cannot decide, upon seeing the last COMMA, whether it is the comma following an identifier as part of the list or the optional trailing comma. You should try instead with:

params:
  LPAREN separated_nonempty_list(COMMA, identifier) option(COMMA) RPAREN {}

Cheers,
Nicolas

You might be interested in this old post by @fpottier.

This gives a shift/reduce conflict:

  LBRACE;
  trees = separated_nonempty_list(COMMA, use_tree);
  option(COMMA);
  RBRACE

** Conflict (shift/reduce) in state 44.
** Token involved: COMMA
** This state is reached from program after reading:

outer_attrs USE simple_path_special LBRACE use_tree

** The derivations that appear below have the following common factor:
** (The question mark symbol (?) represents the spot where the derivations begin to differ.)

program 
items EOF 
item items 
outer_attrs vis_item 
            use_declaration 
            USE use_tree SEMI 
                (?)

** In state 44, looking ahead at COMMA, reducing production
** separated_nonempty_list(COMMA,use_tree) -> use_tree
** is permitted because of the following sub-derivation:

simple_path_special LBRACE separated_nonempty_list(COMMA,use_tree) option(COMMA) RBRACE // lookahead token appears because option(COMMA) can begin with COMMA
                           use_tree . 

** In state 44, looking ahead at COMMA, shifting is permitted
** because of the following sub-derivation:

simple_path_special LBRACE separated_nonempty_list(COMMA,use_tree) option(COMMA) RBRACE 
                           use_tree . COMMA separated_nonempty_list(COMMA,use_tree)
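
To make the conflict above more concrete, here is a small hand-written sketch in plain OCaml (not Menhir, and not how Menhir works internally). It accepts the same shape of input, LBRACE item (COMMA item)* with an optional trailing COMMA, then RBRACE. The token type and the function name parse_braced are made up for illustration. The point is that the comma is consumed first, and only the token after it tells us whether it was a separator or the trailing comma. An LR(1) parser using the right-recursive separated_nonempty_list has to decide one token too early, while still looking at the COMMA.

```ocaml
(* Hypothetical token type, standing in for the real lexer's tokens. *)
type token = LBRACE | RBRACE | COMMA | IDENT of string

(* Returns the parsed items and the remaining tokens, or None on error. *)
let parse_braced (tokens : token list) : (string list * token list) option =
  (* An item has just been read: expect a comma or the closing brace. *)
  let rec after_item acc = function
    | RBRACE :: rest -> Some (List.rev acc, rest)
    | COMMA :: rest -> after_comma acc rest
    | _ -> None
  (* The comma is already consumed; only now do we decide what it was. *)
  and after_comma acc = function
    | RBRACE :: rest -> Some (List.rev acc, rest)    (* trailing comma *)
    | IDENT x :: rest -> after_item (x :: acc) rest  (* separator *)
    | _ -> None
  in
  match tokens with
  | LBRACE :: IDENT x :: rest -> after_item [x] rest
  | _ -> None
```

With this sketch, the inputs `{a, b}` and `{a, b,}` both yield the items a and b, while `{a,,}` and `{}` are rejected.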

@grayswandyr gave me the idea of using left recursion, so the list is built up item by item and the parser can shift a COMMA before deciding whether it is a separator or the optional trailing comma.

use_trees:
  | use_tree_list option(COMMA) { $1 }

use_tree_list:
  | use_tree { [$1] }
  | use_tree_list COMMA use_tree { $3 :: $1 }

This worked!

Just keep in mind that with this technique, the computed list is in reverse order. Depending on your application, you may want to massage it through List.rev.
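
As a quick illustration of the reversal, here is a plain OCaml sketch that mimics the two semantic actions of use_tree_list outside of any parser. The names base and step are made up for this example; they stand in for the actions { [$1] } and { $3 :: $1 }.

```ocaml
(* Mimics the semantic actions of the left-recursive rule:
     use_tree                      -> [x]
     use_tree_list COMMA use_tree  -> x :: acc
   The names base and step are hypothetical. *)
let base x = [x]
let step acc x = x :: acc

let () =
  (* Items arrive in source order: a, then b, then c. *)
  let reversed = step (step (base "a") "b") "c" in
  (* Each new item is consed onto the front, so the list is reversed. *)
  assert (reversed = ["c"; "b"; "a"]);
  (* List.rev restores source order. *)
  assert (List.rev reversed = ["a"; "b"; "c"])
```

In the grammar itself, one place to do this would be the action of use_trees, for example { List.rev $1 } instead of { $1 }, so the reversal happens once after the whole list has been read.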

Thanks for reminding me