Question

I'm trying to implement a expression handling grammar (that deals with nested parenthesis and stuff). I have the following so far, but they can't deal with some cases (successful/failure cases appear after the following code block). Anyone know what's going on?

Note: The varname += and varname = stuff are just some additional AST generation helper stuff in XText. Don't worry about them for now.

...

NilExpression returns Expression:
  'nil';

FalseExpression returns Expression:
  'false';

TrueExpression returns Expression:
  'true';

NumberExpression returns Expression:
  value=Number;

StringExpression returns Expression:
  value=STRING; //EllipsesExpression: '...';
//FunctionExpression: function=function; //don't allow random functions


UnaryExpression:
  op=unop ('(' expr=Expression ')')|expr=Expression;

BinaryExpression:
  'or'? AndOp; //or op

AndOp:
  'and'? ComparisonOp;

ComparisonOp:
  ('>'|'<'|'>='|'<='|'=='|'~=')? ConcatOp;

ConcatOp:
  '..'? AddSubOp;

AddSubOp:
  ('+' '-')? MultDivOp;

MultDivOp:
  ('*' '/')? ExpOp;

ExpOp:
  '^'? (('(' expr=Expression ')')|expr=Expression);

ExprSideOne : Variable|NilExpression|FalseExpression|TrueExpression|
  NumberExpression|StringExpression|UnaryExpression;

Expression:
  ( 
   '('
  expression1=ExprSideOne expression2+=BinaryExpression*
   ')' 
  )
  |
  ( expression1=ExprSideOne expression2+=BinaryExpression* )
;
...

And here's the list of parses/fails:

c = ((b)); //fails
c = ((a not b)); //fails
c = b; //parses
d = (b); //parses
Was it helpful?

Solution

What's going on is that your Expression/Expressions support single parentheses but not multiple parentheses (as you concluded). I don't have ANTLR specific experience but I've worked with Javacc which shares many similar concepts (I wrote a grammar for Prolog... don't ask).

To handle nested parentheses, you typically have something similar to:

ParenthesisExpression: '(' (ParenthesisExpression | Expression) ')';

This would mean that the expression is either wrapped in parentheses or it's just a raw expression. As for how the AST deals with this, a ParenthesisExpression 'is a' Expression, so it can be represented as a subclass or an implementation of (if Expression is an interface/abstract class of sorts).

OTHER TIPS

Make use of the ^ placed after the token/rule name is very useful for defining expressions.

    expression :    e1 (OR^ e1)* ;
    e1  :   e2 (AND^ e2)*;
    e2  :   e3 (PIPE^ e3)*;
    e3  :   e4 (ANDSYMB^ e4)*;
    e4  :   e5 ((EQUAL^|NOTEQUAL^) e5)*;
    e5  :   e6 ((LESS^|GREATER^) e6)*;
    e6  :   e7 ((PLUS^|MINUS^) e7)* ;
    e7  :   e8 ((STAR^|SLASH^) e8)* ;
    e8  :   e9 (NEW^ ID LPAREN RPAREN)*;
    e9  :   (NOT^)? e10;
    e10 :   e11 | call_def;
    e11 :   constant 
        | '(' expression ')' -> expression;

I've used this grammar for simple expressions: http://fisheye2.atlassian.com/browse/~raw,r=5175/antlr-examples/C/polydiff/Poly.g

I've also used the grammar contain in this project for more complex expressions: http://www.codeproject.com/KB/recipes/sota_expression_evaluator.aspx

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top