Quite a few things are going wrong:
0
What 280Z28 said.
1
'250'..'255' does not match the strings "250" ... "255": you need to match the numeric ranges as described in the original ABNF specs:
ABNF
dec-octet = DIGIT ; 0-9
/ %x31-39 DIGIT ; 10-99
/ "1" 2DIGIT ; 100-199
/ "2" %x30-34 DIGIT ; 200-249
/ "25" %x30-35 ; 250-255
ANTLR
dec_octet
: digit
| non_zero_digit digit
| D1 digit digit
| ...
;
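To sanity-check the ranges outside ANTLR, here is a minimal Python sketch that translates the five ABNF alternatives into a regular expression. The names DEC_OCTET and is_dec_octet are made up for this illustration and are not part of any grammar or library.

```python
import re

# Each alternative mirrors one line of the ABNF dec-octet production.
DEC_OCTET = re.compile(
    r"""
      25[0-5]      # 250-255
    | 2[0-4][0-9]  # 200-249
    | 1[0-9]{2}    # 100-199
    | [1-9][0-9]   # 10-99
    | [0-9]        # 0-9
    """,
    re.VERBOSE,
)


def is_dec_octet(text: str) -> bool:
    """Return True if text is a decimal octet (0-255, no leading zeros)."""
    # fullmatch requires the whole string to match one alternative.
    return DEC_OCTET.fullmatch(text) is not None
```

With this sketch, is_dec_octet("255") holds while is_dec_octet("256") and is_dec_octet("01") do not, which is exactly what the per-digit alternatives in the ABNF enforce.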
2
You have a lot of conflicting lexer rules. Take these for example:
HEXDIG : [0-9A-F] ;
ALPHA : [a-zA-Z] ;
Because HEXDIG is defined before ALPHA, the lexer will always create a HEXDIG when it sees 'A', for example. You must realize that the lexer does not produce tokens based on what the parser would like to receive. The lexer will go its own way and will never produce an ALPHA for the uppercase letters A-F.
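The effect can be reproduced with a toy tokenizer in Python. This is only a simplified model of the "longest match wins, ties go to the rule defined first" behaviour, not ANTLR's actual implementation; RULES and tokenize are hypothetical names for this sketch.

```python
import re

# Rules in definition order, as in the grammar from the question.
RULES = [
    ("HEXDIG", re.compile(r"[0-9A-F]")),
    ("ALPHA", re.compile(r"[a-zA-Z]")),
]


def tokenize(text):
    """Split text into (rule_name, lexeme) pairs without consulting a parser."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in RULES:
            match = pattern.match(text, pos)
            # Only a strictly longer match replaces the current best,
            # so on a tie the rule defined first is kept.
            if match and match.end() - pos > best_len:
                best_name, best_len = name, match.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens
```

Calling tokenize("Ag") yields a HEXDIG for 'A' and an ALPHA for 'g': the uppercase letters A-F never come out as ALPHA, no matter what a parser rule would like to see at that position.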
3
fragment rules can only be used inside other lexer rules (or other fragment rules). You cannot use them inside parser rules.
4
Not really an issue, but the predicates make your grammar hard to read. My rule of thumb: use as few predicates as possible.
Your rule:
h16
locals [int i = 1;]
: ( {$i>=1 && $i<=4}? HEXDIG {$i++;} )* ;
could be written as:
h16
: HEXDIG HEXDIG HEXDIG HEXDIG
| HEXDIG HEXDIG HEXDIG
| HEXDIG HEXDIG
| HEXDIG
;
or even:
h16
: HEXDIG (HEXDIG (HEXDIG HEXDIG?)?)?
;
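If you want to convince yourself that the two predicate-free forms accept the same input, here is a small Python sketch that encodes both as regular expressions and compares them. It assumes HEXDIG is [0-9A-F] as in the lexer rule quoted earlier; FLAT, NESTED and matches_h16 are names invented for this illustration.

```python
import re

HEX = "[0-9A-F]"  # assumption: same character set as the HEXDIG lexer rule

# The four explicit alternatives.
FLAT = re.compile(f"{HEX}{HEX}{HEX}{HEX}|{HEX}{HEX}{HEX}|{HEX}{HEX}|{HEX}")
# The nested-optional form.
NESTED = re.compile(f"{HEX}({HEX}({HEX}{HEX}?)?)?")


def matches_h16(text: str) -> bool:
    """Return True if text is one to four hex digits; both forms must agree."""
    flat = FLAT.fullmatch(text) is not None
    nested = NESTED.fullmatch(text) is not None
    assert flat == nested, "the two formulations disagree"
    return nested
```

Both forms accept "A" and "1F3B" and reject the empty string and "12345". Note that the original predicate version uses `*`, so it would also accept zero HEXDIG tokens.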
Most of these issues are easily fixed, but #2 is trickier. What you could (should?) do is let the lexer create single-char tokens and let the parser combine these single-char tokens into larger constructs. An example of how you could let the parser match the dec-octet production from the official ABNF:
dec_octet
: digit // 0-9
| non_zero_digit digit // 10-99
| D1 digit digit // 100-199
| D2 (D0 | D1 | D2 | D3 | D4) digit // 200-249
| D2 D5 (D0 | D1 | D2 | D3 | D4 | D5) // 250-255
;
digit
: D0
| non_zero_digit
;
non_zero_digit
: D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 | D9
;
// lexer rules
D0 : '0';
D1 : '1';
D2 : '2';
D3 : '3';
D4 : '4';
D5 : '5';
D6 : '6';
D7 : '7';
D8 : '8';
D9 : '9';
I once wrote an IRI grammar for ANTLR 3. If you want, I could put it on GitHub somewhere.