For the input "A/KK543/EPOS", the following tokens are created:
'A' 'A' SLASH '/' LETTER 'K' LETTER 'K' DIGIT '5' DIGIT '4' DIGIT '3' SLASH '/' 'E' 'E' LETTER 'P' LETTER 'O' LETTER 'S'
But for the input "A/KL543/EPOS", these are created:
'A' 'A' SLASH '/' LETTER 'K' 'L' 'L' DIGIT '5' DIGIT '4' DIGIT '3' SLASH '/' 'E' 'E' LETTER 'P' LETTER 'O' LETTER 'S'
As you can see, the char 'L' does not get tokenized as a LETTER. For the literal tokens 'A', 'E', 'L' and 'N' used inside your parser rules, ANTLR automatically creates separate lexer rules that are placed before all other lexer rules. This causes your lexer to look like this behind the scenes:
A : 'A';
E : 'E';
L : 'L';
N : 'N';
DIGIT : '0'..'9';
LETTER : 'A'..'Z';
SLASH : '/';
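To make the effect of that ordering concrete, here is a small Python sketch. It is not ANTLR itself, only a simplified model that assumes every rule matches exactly one character and that the first rule in the list that matches wins. The names RULES and tokenize are made up for this illustration.

```python
# Simplified model of lexer rule priority (an assumption for illustration,
# not ANTLR's real algorithm): each rule matches a single character, and
# rules are tried in the order they are listed.
RULES = [
    ("A", lambda c: c == "A"),
    ("E", lambda c: c == "E"),
    ("L", lambda c: c == "L"),
    ("N", lambda c: c == "N"),
    ("DIGIT", lambda c: "0" <= c <= "9"),
    ("LETTER", lambda c: "A" <= c <= "Z"),
    ("SLASH", lambda c: c == "/"),
]


def tokenize(text):
    """Return a list of (token_type, char) pairs for the given input."""
    tokens = []
    for ch in text:
        for name, matches in RULES:
            if matches(ch):
                # The first matching rule wins, so 'L' never reaches LETTER.
                tokens.append((name, ch))
                break
        else:
            raise ValueError(f"no rule matches {ch!r}")
    return tokens


if __name__ == "__main__":
    for name, ch in tokenize("A/KL543/EPOS"):
        print(name, repr(ch))
```

In this model the 'L' in "KL" comes out as an L token while the 'K' before it is a LETTER, which mirrors the token stream shown for "A/KL543/EPOS".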
Therefore, any single 'A', 'E', 'L' or 'N' will never become a LETTER token. This is simply how ANTLR works. If you want to match them as letters, you'll need to create a parser rule letter and let it match these tokens too. Something like this:
message
: A (SLASH flt)? (SLASH restriction)?
;
flt
: car fnum?
;
fnum
: DIGIT DIGIT DIGIT DIGIT? letter?
;
restriction
: E ap
| L ap
| N
;
ap
: letter letter letter
;
car
: letter letter
;
letter
: A
| E
| L
| N
| LETTER
;
A : 'A';
E : 'E';
L : 'L';
N : 'N';
DIGIT : '0'..'9';
LETTER : 'A'..'Z';
SLASH : '/';
which will parse the input "A/KL543/EPOS" like this: