I have a grammar and everything works fine until this portion:

lexp
: factor ( ('+' | '-') factor)* 
;

factor :('-')? IDENT;

This of course introduces an ambiguity. For example a-a can be matched by either Factor - Factor or Factor -> - IDENT

I get the following warning stating this:

[18:49:39] warning(200): withoutWarningButIncomplete.g:57:31: 
Decision can match input such as "'-' {IDENT, '-'}" using multiple alternatives: 1, 2

How can I resolve this ambiguity? I just don't see a way around it. Is there some kind of option that I can use?

Here is the full grammar:

program 
    : includes decls (procedure)*
    ;

/* Check if correct! */
includes
    :  ('#include' STRING)* 
    ;

decls
    : (typedident ';')*
    ;

typedident
: ('int' | 'char') IDENT    
;

procedure
    : ('int' | 'char') IDENT '(' args ')' body
    ;

args
: typedident (',' typedident )*  /* Check if correct! */
|   /* epsilon */ 
;

body
: '{' decls stmtlist '}'
;   

stmtlist
: (stmt)*;

stmt  

:  '{' stmtlist '}'
| 'read' '(' IDENT ')' ';'
| 'output' '(' IDENT ')' ';'
| 'print' '(' STRING ')' ';'
| 'return' (lexp)* ';'
| 'readc' '(' IDENT ')' ';'
| 'outputc' '(' IDENT ')' ';'
|  IDENT '(' (IDENT ( ',' IDENT )*)? ')' ';'
| IDENT '=' lexp ';';



lexp
: term (( '+' | '-' ) term) * /*Add in | '-'  to reveal the warning! !*/
;

term 
    : factor (('*' | '/' | '%') factor )*
;  


factor : '(' lexp ')' 
 | ('-')? IDENT
 | NUMBER;




fragment DIGIT
: ('0' .. '9')
;

IDENT : ('A' .. 'Z' | 'a' .. 'z') (( 'A' .. 'Z' | 'a' .. 'z' | '0' .. '9' | '_'))* ;


NUMBER
: ( ('-')? DIGIT+)
;

CHARACTER
: '\'' ('a' .. 'z' | 'A' .. 'Z'  | '0' .. '9' | '\\n' |  '\\t'  | '\\\\'  | '\\'  | 'EOF'  |'.' | ',' |':'  )  '\'' /* IS THIS COMPLETE? */
;
有帮助吗?

解决方案

As mentioned in the comments: these rules are not ambiguous:

lexp
 : factor (('+' | '-') factor)* 
 ;

factor : ('-')? IDENT;

This is the cause of the ambiguity:

'return' (lexp)* ';'

which can parse the input a-b in two different ways:

  1. a-b as a single binary expression
  2. a as a single expression, and -b as an unary expression

You will need to change your grammar. Perhaps add a comma in multiple return values? Something like this:

'return' (lexp (',' lexp)*)? ';'

which will match:

return;
return a;
return a, -b;
return a-b, c+d+e, f;
...
许可以下: CC-BY-SA归因
不隶属于 StackOverflow
scroll top