AFAIK, there's no way to handle this in your lexer grammar without some manual code (which is IMHO better than promoting comments to the parser!).
What you could do is this:
- match '--'
- in a custom method, manually look ahead until the end of the line (EOL). Let this method return true when the '--' is part of a directive:
  - if what you matched until the EOL looks to be a directive, do NOT match the characters and return true
  - if what you matched until the EOL isn't a directive, match the characters and return false
- if your custom method returned false, it must be a comment and you can skip() it
A quick demo:
grammar T;

@lexer::members {
  private boolean directiveAhead() throws MismatchedTokenException {
    StringBuilder b = new StringBuilder();
    for(int ahead = 1; ; ahead++) {
      // Grab the next character from the input.
      int next = input.LA(ahead);
      // Check if we're at the EOL (or at the end of the input).
      if(next == -1 || next == '\r' || next == '\n') {
        break;
      }
      b.append((char)next);
    }
    if(b.toString().trim().matches("\\w+:\\w+")) {
      // Do NOT let the lexer consume all the characters, just return true!
      return true;
    }
    else {
      // Let the lexer consume all the characters!
      this.match(b.toString());
      return false;
    }
  }
}

parse
  : directive EOF
  ;

directive
  : DIRECTIVE_START IDENTIFIER COL IDENTIFIER
  ;

IDENTIFIER
  : ('a'..'z' | 'A'..'Z')+
  ;

DIRECTIVE_START
  : '--' { if(!directiveAhead()) skip(); }
  ;

COL
  : ':'
  ;

SPACES
  : (' ' | '\t' | '\r' | '\n')+ {skip();}
  ;
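
If you want to check which lines the look-ahead would treat as a directive without generating a lexer first, the same decision can be tried in plain Java. This is only a rough sketch of the logic inside directiveAhead(): the class name DirectiveCheck and the (input, start) signature are made up for illustration, and it reads from a String instead of ANTLR's input stream.

```java
import java.util.regex.Pattern;

public class DirectiveCheck {
    // Same pattern as in the grammar's directiveAhead(): word chars, ':', word chars.
    private static final Pattern DIRECTIVE = Pattern.compile("\\w+:\\w+");

    /**
     * Hypothetical stand-alone version of the look-ahead: collects everything
     * from offset 'start' (the position right after "--") up to the EOL or the
     * end of the input, and reports whether it looks like a directive.
     */
    public static boolean directiveAhead(String input, int start) {
        StringBuilder b = new StringBuilder();
        for (int i = start; i < input.length(); i++) {
            char next = input.charAt(i);
            // Stop at the end of the line.
            if (next == '\r' || next == '\n') {
                break;
            }
            b.append(next);
        }
        return DIRECTIVE.matcher(b.toString().trim()).matches();
    }

    public static void main(String[] args) {
        String src = "-- foo:bar\n-- just a comment\n";
        System.out.println(directiveAhead(src, 2));  // prints true  (directive)
        System.out.println(directiveAhead(src, 13)); // prints false (comment)
    }
}
```

Note that \w also accepts digits and underscores, while IDENTIFIER in the grammar only matches letters, so something like "--a1:b2" would pass the look-ahead but not tokenize as a directive. If that matters for your input, tightening the regex to [a-zA-Z]+:[a-zA-Z]+ would keep the two in sync.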