Question

I'm trying to extend the grammar of the Tiny Language to treat assignment as expression. Thus it would be valid to write

a = b = 1; // -> a = (b = 1)
a = 2 * (b = 1); // contrived but valid
a = 1 = 2; // invalid

Assignment differs from other operators in two aspects. It's right associative (not a big deal), and its left-hand side is has to be a variable. So I changed the grammar like this

statement: assignmentExpr | functionCall ...;

assignmentExpr: Identifier indexes? '=' expression;

expression: assignmentExpr | condExpr;

It doesn't work, because it contains a non-LL(*) decision. I also tried this variant:

assignmentExpr: Identifier indexes? '=' (expression | condExpr);

but I got the same error. I am interested in

  • This specific question
  • Given a grammar with a non-LL(*) decision, how to find the two paths that cause the problem
  • How to fix it
Was it helpful?

Solution

The key here is that you need to "assure" the parser that inside an expression, there is something ahead that satisfies the expression. This can be done using a syntactic predicate (the ( ... )=> parts in the add and mult rules).

A quick demo:

grammar TL;

options {
  output=AST;
}

tokens {
  ROOT;
  ASSIGN;
}

parse
  :  stat* EOF -> ^(ROOT stat+)
  ;

stat
  :  expr ';' -> expr
  ;

expr
  :  add
  ;

add
  :  mult ((('+' | '-') mult)=> ('+' | '-')^ mult)*
  ;

mult
  :  atom ((('*' | '/') atom)=> ('*' | '/')^ atom)*
  ;

atom
  :  (Id -> Id) ('=' expr -> ^(ASSIGN Id expr))?
  |  Num
  |  '(' expr ')' -> expr
  ;

Comment : '//' ~('\r' | '\n')* {skip();};
Id      : 'a'..'z'+;
Num     : '0'..'9'+;
Space   : (' ' | '\t' | '\r' | '\n')+ {skip();};

which will parse the input:

a = b = 1;       // -> a = (b = 1)
a = 2 * (b = 1); // contrived but valid

into the following AST:

enter image description here

OTHER TIPS

I think you can change your grammar like this to achieve the same, without using syntactic predicates:

statement: Expr ';' | functionCall ';'...;

Expr: Identifier indexes? '=' Expr  |  condExpr ;

condExpr: .... and so on;

I altered Bart's example with this idea in mind:

grammar TL;

options {
  output=AST;
}

tokens {
  ROOT;
}

parse
  :  stat+ EOF -> ^(ROOT stat+)
  ;

stat
  :  expr ';' 
  ;

expr
  : Id Assign expr -> ^(Assign Id expr)

  | add
  ;

add
  :  mult (('+' | '-')^ mult)*
  ;

mult
  :  atom (('*' | '/')^ atom)*
  ;

atom
  :  Id
  |  Num
  |  '('! expr ')' !
  ;

Assign  :   '=' ;
Comment : '//' ~('\r' | '\n')* {skip();};
Id      : 'a'..'z'+;
Num     : '0'..'9'+;
Space   : (' ' | '\t' | '\r' | '\n')+ {skip();};

And for the input:

a=b=4;
a = 2 * (b = 1);

you get following parse tree: enter image description here

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top