Is there a mechanism in Antlr to allow the lexer to match a token only during certain rules?

https://stackoverflow.com/questions/1620011

antlr
lexer

06-07-2019

Question

I'd like to add a keyword to my language.

This keyword would only have to be matched during one particular parser grammar rule.

For backward compatibility, I'd like the keyword to remain usable as a variable name, i.e. it should still be accepted wherever the lexer rule for variable names would match.

As it stands, the lexer emits the new KEYWORD token wherever "key" appears in the file, including places where it is meant as a variable name.

Is the appropriate way of working around this to modify the var_declaration rule so that it matches either an IDENT or the new KEYWORD tokens?

protected
modified_var_declaration:
     VAR (IDENT|KEYWORD)
;

The relevant rules are:

IDENT   // matches variable names
options { testLiterals=true; }
    : ( '_' | 'a'..'z' | 'A'..'Z' ) ( '_' | 'a'..'z' | 'A'..'Z' | DIGIT )*
;

KEYWORD: // my new keyword
  "key"
;

The parser rule for creating a variable is:

protected
var_declaration:
     VAR IDENT
;

Solution

Many languages have context-sensitive keywords. The first step to handling them is adding a new parser rule ident representing a variable name. Use that rule in your parser instead of IDENT.

ident
    : IDENT
    | KEYWORD
    ;
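As a minimal sketch of how the rest of the parser might then look, written in the same ANTLR 2 style as the question: var_declaration refers to ident instead of IDENT, and only the one rule that needs the keyword mentions KEYWORD directly. The rule name key_statement and its body are hypothetical stand-ins for whatever construct actually introduces the keyword.

```antlr
// Variable declarations now accept either token as a name,
// so existing code that uses "key" as a variable keeps parsing.
protected
var_declaration:
     VAR ident
;

// Hypothetical rule: the single construct where "key" acts as a keyword.
// Its name and body are assumptions made for illustration only.
key_statement:
     KEYWORD ident
;
```

With this layout the lexer stays context-free: it always emits KEYWORD for "key", and the parser decides from the surrounding rule whether the token is a name or a keyword. Any other parser rule that previously referenced IDENT for a name would need the same substitution.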

OTHER TIPS

See http://www.antlr.org/wiki/display/ANTLR3/1.+Lexer, which should help. You will need to set a flag wherever KEYWORD is allowed and clear it everywhere else, then guard the lexer rule with that flag, e.g.

KEYWORD : { keywordcontext }?=> "key";

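The trick here is toggling the flag at exactly the points where a keyword can be expected, which may be far from trivial. Note that the gated predicate form { ... }?=> is ANTLR 3 syntax, while the grammar in the question is written in ANTLR 2 style.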
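A rough sketch of the flag approach in ANTLR 3 syntax follows (ANTLR 3 writes string literals with single quotes). The flag name keywordContext, the BEGIN_KEY token and its 'begin' spelling are all hypothetical; they assume the keyword can only appear right after some token the lexer can recognise on its own.

```antlr
@lexer::members {
    // Hypothetical flag: true only while "key" should lex as KEYWORD.
    boolean keywordContext = false;
}

// Hypothetical opening token of the construct that accepts the keyword.
BEGIN_KEY : 'begin' { keywordContext = true; } ;

// Gated predicate: this rule is only considered while the flag is set.
// Otherwise "key" falls through to IDENT.
KEYWORD : { keywordContext }?=> 'key' { keywordContext = false; } ;

// Any ordinary identifier ends the keyword context.
IDENT
    : ( '_' | 'a'..'z' | 'A'..'Z' ) ( '_' | 'a'..'z' | 'A'..'Z' | '0'..'9' )*
      { keywordContext = false; }
    ;
```

The flag is toggled inside the lexer rather than from parser actions because the token stream may read ahead of the parser, so a flag set by the parser can arrive too late to influence the token in question. This is also why the approach gets awkward when the keyword's context cannot be detected from preceding tokens alone.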

Licensed under: CC-BY-SA with attribution
