문제

A stripped down version of the grammar with the conflict:

body: variable_list function_list;
variable_list:
  variable_list variable | /* empty */
;
variable:
  TYPE identifiers ';'
;
identifiers:
  identifiers ',' IDENTIFIER | IDENTIFIER
;
function_list:
  function_list function | /* empty */
;
function:
  TYPE IDENTIFIER '(' argument_list ')' function_body
;

The problem is that variables and functions both start with TYPE and IDENTIFIER, e.g

int some_var;
int foo() { return 0; }

variables are always declared before functions in this language, but when a parse is attempted, it always gives

parse error: syntax error, unexpected '(', expecting ',' or ';' [after foo]

How can the variable_list be made to be less greedy, or have the parser realize that if the next token is a '(' instead of a ';' or ',' it is obviously a function and not a variable declaration?

The bison debug output for the conflict is

state 17

3 body: variable_list . function_list
27 variable_list: variable_list . variable

T_INT    shift, and go to state 27
T_BOOL   shift, and go to state 28
T_STR    shift, and go to state 29
T_VOID   shift, and go to state 30
T_TUPLE  shift, and go to state 31

T_INT     [reduce using rule 39 (function_list)]
T_BOOL    [reduce using rule 39 (function_list)]
T_STR     [reduce using rule 39 (function_list)]
T_VOID    [reduce using rule 39 (function_list)]
T_TUPLE   [reduce using rule 39 (function_list)]
$default  reduce using rule 39 (function_list)

variable       go to state 32
simpletype     go to state 33
type           go to state 34
function_list  go to state 35

I have tried all sorts of %prec statements to make it prefer reduce (although I am not sure what the difference would be in this case), with no success at making bison use reduce to resolve this, and I have also tried shuffling the rules around making new rules like non_empty_var_list and having body split up into function_list | non_empty_var_list function_list and none of the attempts would fix this issue. I'm new to this and I've run out of ideas of how to fix this, so I'm just completely baffled.

도움이 되었습니까?

해결책

the problem is in that variables and functions both start with TYPE and IDENTIFIER

Not exactly. The problem is that function_list is left-recursive and possibly empty.

When you reach the semi-colon terminating a variable with TYPE in the lookahead, the parser can reduce the variable into a variable_list, as per the first variable_list production. Now the next thing might be function_list, and function_list is allowed to be empty. So it could do an empty reduction to a function_list, which is what would be necessary to start parsing a function. It can't know not to do that until it looks at the '(' which is the third next token. That's far too far away to be relevant.

Here's an easy solution:

function_list: function function_list
             | /* EMPTY */
             ;

Another solution is to make function_list non-optional:

body: variable_list function_list
    | variable_list
    ;

function_list: function_list function
             | function
             ;

If you do that, bison can shift the TYPE token without having to decide whether it's the start of a variable or function definition.

라이센스 : CC-BY-SA ~와 함께 속성
제휴하지 않습니다 StackOverflow
scroll top