Using an ANTLR grammar to distinguish statements that start with the same token but differ by a keyword in the middle

StackOverflow https://stackoverflow.com/questions/13786368

Question

I've run into a bit of a headache.

I'm trying to split statements into different rules. For example, I have this input:

start
 n turnTo 's'.
 n terminate.
end

Both statements start with 'n'. Currently I have written:

statement 
    :
    (turnTo_statment|terminate_statment)*
    ;

turnTo_statment
    :
    variable 'turnTo' '\'' value '\'' '.'
    ;

terminate_statment
    :
    variable 'terminate' '.'
    ;

But when the lexer runs, it cannot determine which one is which, because both sub-statements start with the same thing, 'n', so ANTLR has alternative rules to choose from. If the next string does not match the first rule it tries, it automatically throws a no-match error.

How can I tell ANTLR to use the rule turnTo_statment when it meets 'x turnTo y', and the rule terminate_statment when it meets 'x terminate .'?

That is, is there any feature in ANTLR that does something like this:

statement 
    :
    ((if statement contain_keywords 'turnTo') -> turnTo_statment
    |
    (if statement contain_keywords 'terminate') -> terminate_statment)*
    ;

Thanks.

Solution

First, don't use 'literals' in your parser rules. Without a lot of experience in ANTLR, this will get you into trouble. Create real lexer rules:

TURNTO: 'turnTo';

Now, you probably need to read through the tutorials on the ANTLR wiki, study the downloadable examples, and make sure you understand them. Writing good grammars seems easy because the grammar language is trivial to learn, but in fact it requires quite a lot of knowledge. The first thing to realize, though, is that the lexer has no knowledge of the parser: it merely tokenizes the input stream and passes those tokens to the parser. So the lexer patterns cannot be ambiguous, while the parser rules can deal with potential differences.
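
To make the lexer/parser split concrete, here is a minimal hand-written sketch in Python of what a lexer does with the example input. It is not ANTLR output. The token names other than TURNTO (TERMINATE, START, END, QUOTED, DOT, ID) are assumptions chosen for illustration, and treating a quoted value as a single QUOTED token is a simplification of the question's rule.

```python
import re

# Hypothetical token names, in the spirit of lexer rules such as TURNTO: 'turnTo';
# Keywords are listed before ID so that they win over the generic identifier pattern.
TOKEN_SPEC = [
    ("TURNTO", r"turnTo\b"),
    ("TERMINATE", r"terminate\b"),
    ("START", r"start\b"),
    ("END", r"end\b"),
    ("QUOTED", r"'[^']*'"),   # assumption: a quoted value is one token
    ("DOT", r"\."),
    ("ID", r"[A-Za-z_]\w*"),
    ("WS", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Turn text into (token_type, lexeme) pairs, with no knowledge of any parser rule."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "WS":  # whitespace is skipped
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Note that 'n' comes out as the same ID token in both statements. The lexer never has to decide which statement it is in; that decision belongs to the parser.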

ANTLR can probably handle your grammar without transforming it to LL(1), as ANTLR can handle LL(k) and usually works out what k is without your help (is this your entire grammar?). However, it is always best to left-factor anyway:

statement: var ( TURNTO {etc} | TERMINATE DOT )
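
As a rough illustration of why the left-factored rule is easy to parse, here is a hand-written recursive-descent sketch in Python. It is not what ANTLR generates. It assumes tokens arrive as (type, text) pairs, and the token names ID, QUOTED and DOT are hypothetical. The turnTo branch follows the question's rule (a quoted value, then a dot), which is an assumption about what the elided part of the rule contains.

```python
def parse_statement(tokens, pos=0):
    """Parse one statement of the shape: var ( TURNTO QUOTED DOT | TERMINATE DOT ).

    Returns (node, next_position). Token names are illustrative assumptions.
    """
    def kind_at(i):
        return tokens[i][0] if i < len(tokens) else "EOF"

    def expect(kind, i):
        if kind_at(i) != kind:
            raise SyntaxError(f"expected {kind} at token {i}, found {kind_at(i)}")
        return tokens[i][1]

    # The shared prefix is consumed once, before any decision is made.
    var = expect("ID", pos)

    # A single token of lookahead is now enough to pick the branch.
    lookahead = kind_at(pos + 1)
    if lookahead == "TURNTO":
        value = expect("QUOTED", pos + 2)
        expect("DOT", pos + 3)
        return ("turnTo", var, value.strip("'")), pos + 4
    if lookahead == "TERMINATE":
        expect("DOT", pos + 2)
        return ("terminate", var), pos + 3
    raise SyntaxError(f"expected TURNTO or TERMINATE at token {pos + 1}, found {lookahead}")


def parse_statements(tokens):
    """Parse zero or more statements until the token list is exhausted."""
    nodes, pos = [], 0
    while pos < len(tokens):
        node, pos = parse_statement(tokens, pos)
        nodes.append(node)
    return nodes
```

Feeding it the tokens for "n turnTo 's'. n terminate." should give one turnTo node and one terminate node, with no backtracking needed.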

OTHER TIPS

Your grammar isn't an LL(1) grammar (because, as you noticed, first(turnTo_statment) = first(terminate_statment)). You can, however, transform it into an LL(1) grammar by left-factoring:

statement -> var_stmt statement | ε
var_stmt -> variable (turnto_stmt | terminate_stmt)
turnto_stmt -> "turnTo" value "."
terminate_stmt -> "terminate" "."

I don't know much about ANTLR, but this is the traditional way of dealing with these kinds of conflicts.
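
The FIRST-set conflict mentioned above can be checked mechanically. The following Python sketch is only an illustration under simplifying assumptions: no rule derives the empty string, so the FIRST set of an alternative is the FIRST set of its first symbol. The grammar encoding (a dict of rule name to list of alternatives) and the rule name "tail" are made up for this example.

```python
def first(symbol, grammar, seen=frozenset()):
    """FIRST set of a symbol, assuming no rule can derive the empty string."""
    if symbol not in grammar:      # anything without a rule is a terminal
        return {symbol}
    if symbol in seen:             # guard against left recursion
        return set()
    result = set()
    for alternative in grammar[symbol]:
        result |= first(alternative[0], grammar, seen | {symbol})
    return result


def ll1_conflicts(grammar):
    """Map each rule to the terminals shared by two or more of its alternatives."""
    conflicts = {}
    for rule, alternatives in grammar.items():
        for i in range(len(alternatives)):
            for j in range(i + 1, len(alternatives)):
                overlap = first(alternatives[i][0], grammar) & first(alternatives[j][0], grammar)
                if overlap:
                    conflicts.setdefault(rule, set()).update(overlap)
    return conflicts


# The question's grammar: both alternatives of 'statement' begin with 'variable'.
original = {
    "statement": [["turnTo_statment"], ["terminate_statment"]],
    "turnTo_statment": [["variable", "turnTo", "value", "."]],
    "terminate_statment": [["variable", "terminate", "."]],
}

# A left-factored version: the shared prefix is pulled out of the choice.
factored = {
    "statement": [["variable", "tail"]],
    "tail": [["turnTo", "value", "."], ["terminate", "."]],
}
```

Calling ll1_conflicts(original) reports a conflict on 'variable' in the statement rule, while ll1_conflicts(factored) reports none, which is the effect left-factoring is meant to have.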

Licensed under: CC-BY-SA with attribution