The string `"keyup"` is being tokenized as a `NAME` token: that is the problem.
You must realize that the lexer operates independently of the parser. Even if the parser is trying to match a `KEYPRESS` token, the lexer does not "listen" to it; it simply constructs a token following these rules:
- match the rule that consumes the most characters
- if more than one rule matches the same number of characters, choose the one that is defined first
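These two rules can be sketched with a toy matcher in plain Python (not ANTLR itself; the `AB` and `WORD` rules are made up purely for illustration):

```python
import re

# Toy illustration of the two matching rules above (plain Python, not ANTLR).
# AB and WORD are hypothetical lexer rules:  AB : 'ab' ;   WORD : [a-z]+ ;
rules = [("AB", r"ab"), ("WORD", r"[a-z]+")]

def first_token(text):
    best = ("<none>", "")
    for name, pattern in rules:
        m = re.match(pattern, text)
        # strictly longer match wins; on a tie the earlier rule is kept
        if m and len(m.group()) > len(best[1]):
            best = (name, m.group())
    return best

# first_token("abc") -> ("WORD", "abc")  # longest match wins
# first_token("ab")  -> ("AB", "ab")     # tie: the first-defined rule wins
```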
Taking these rules into account, together with the order of your lexer rules:

```antlr
NAME     : [A-Za-z_][A-Za-z_0-9]* ;
INT      : [0-9]+ ;
KEY      : [a-z] | [0-9] | 'shift' | 'ctrl' | 'alt' | 'meta' | 'space' | 'left' | 'right' | 'up' | 'down' | 'minus' | 'equals' | 'backspace' | 'openbracket' | 'closebracket' | 'backslash' | 'semicolon' | 'quote' | 'enter' | 'comma' | 'period' | 'slash' ;
KEYPRESS : 'keyup' | 'keydown' ;
```
a `NAME` token will be created instead of most of the `KEY` alternatives and instead of all of the `KEYPRESS` alternatives. And since `INT` matches one or more digits and is defined before `KEY`, which also has a single-digit alternative, it is clear that the lexer will never produce a `KEY` or `KEYPRESS` token.
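To make this concrete, here is a toy model of the lexer with the rules in their original order (plain Python, not ANTLR; `KEY`'s long keyword list is abbreviated to a few entries):

```python
import re

# Toy model (not ANTLR) of the lexer with the rules in their ORIGINAL order.
# A rule matches as much as its longest alternative.
RULES = [
    ("NAME", [r"[A-Za-z_][A-Za-z_0-9]*"]),
    ("INT", [r"[0-9]+"]),
    ("KEY", [r"[a-z]", r"[0-9]", "shift", "ctrl", "up", "down"]),  # abbreviated
    ("KEYPRESS", ["keyup", "keydown"]),
]

def tokenize_one(text):
    """Longest match wins; ties go to the rule defined first."""
    best_name, best_len = None, 0
    for name, alts in RULES:
        rule_len = max((m.end() for a in alts
                        for m in [re.match(a, text)] if m), default=0)
        if rule_len > best_len:
            best_name, best_len = name, rule_len
    return best_name

# "keyup" ties between NAME and KEYPRESS (5 chars each): NAME is first, so NAME wins.
# "5" ties between INT and KEY (1 char each): INT is first, so INT wins.
```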
If you move the `NAME` and `INT` rules below the `KEY` and `KEYPRESS` rules, then, my guess is, most of the tokens will be constructed as you expect.
EDIT
A possible solution would look like this:

```antlr
KEY          : [a-z] | 'shift' | 'ctrl' | 'alt' | 'meta' | 'space' | 'left' | 'right' | 'up' | 'down' | 'minus' | 'equals' | 'backspace' | 'openbracket' | 'closebracket' | 'backslash' | 'semicolon' | 'quote' | 'enter' | 'comma' | 'period' | 'slash' ;
KEYPRESS     : 'keyup' | 'keydown' ;
NAME         : [A-Za-z_][A-Za-z_0-9]* ;
SINGLE_DIGIT : [0-9] ;
INT          : [0-9]+ ;
```
I.e. I removed the `[0-9]` alternative from `KEY` and introduced a `SINGLE_DIGIT` rule (which is placed before the `INT` rule!).
Now create some extra parser rules:

```antlr
integer : INT | SINGLE_DIGIT ;
key     : KEY | SINGLE_DIGIT ;
```
and change all occurrences of `INT` inside parser rules to `integer` (don't call your rule `int`: it is a reserved word), and change all occurrences of `KEY` to `key`.
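To double-check the fix, here is a toy model of the proposed rule order (plain Python, not ANTLR; `KEY`'s keyword list is abbreviated to a few entries):

```python
import re

# Toy model (not ANTLR) of the PROPOSED rule order.
# A rule matches as much as its longest alternative.
RULES = [
    ("KEY", [r"[a-z]", "shift", "ctrl", "up", "down"]),  # abbreviated, no [0-9]
    ("KEYPRESS", ["keyup", "keydown"]),
    ("NAME", [r"[A-Za-z_][A-Za-z_0-9]*"]),
    ("SINGLE_DIGIT", [r"[0-9]"]),
    ("INT", [r"[0-9]+"]),
]

def tokenize_one(text):
    """Longest match wins; ties go to the rule defined first."""
    best_name, best_len = None, 0
    for name, alts in RULES:
        rule_len = max((m.end() for a in alts
                        for m in [re.match(a, text)] if m), default=0)
        if rule_len > best_len:
            best_name, best_len = name, rule_len
    return best_name

# "keyup" -> KEYPRESS (beats KEY's 1-char [a-z] match, ties with NAME but comes first)
# "7"     -> SINGLE_DIGIT (ties with INT but comes first)
# "42"    -> INT,  "shift" -> KEY,  "myMacro" -> NAME
```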
And you might also want to do something similar for `NAME` and the `[a-z]` alternative in `KEY`: with this order, a single lowercase character will never be tokenized as a `NAME`, always as a `KEY`.