A cheat that I've used in the past is to co-opt the token's default line/column fields to hold an offset instead. If you don't need the line/column information, you can do something like this:
    options {
        COMMON_TOKEN_ACTION = true;
    }
...
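    TOKEN_MGR_DECLS : {
        static long offset = 0;

        static void CommonTokenAction(Token t) {
            // Poor man's re-initialization: JavaCC lines and columns are 1-based,
            // so a token at the very start of the input has line 1, column 1.
            if ((t.beginLine == 1) && (t.beginColumn == 1)) { offset = 0; }
            // Split the 64-bit start offset across the two int fields.
            t.beginLine = (int) (offset >> 32);
            t.endLine = (int) offset;
            // Advance past this token for the next one.
            offset += t.image.length();
        }
    }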
Neither the token manager nor the parser relies on line/column information, so this is safe to do. The offset of a token t can be recovered by reversing the same packing: t.beginLine holds the high 32 bits and t.endLine holds the low 32 bits.
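As a minimal sketch of that round trip, the helper below packs a long offset into two ints and rebuilds it. The class name OffsetPacking and its methods are hypothetical, not part of JavaCC; the two int arguments stand in for t.beginLine and t.endLine.

```java
// Sketch only: OffsetPacking is a hypothetical helper, not a JavaCC class.
final class OffsetPacking {
    private OffsetPacking() {}

    /** High 32 bits of the offset, as would be stored in t.beginLine. */
    static int high(long offset) {
        return (int) (offset >> 32);
    }

    /** Low 32 bits of the offset, as would be stored in t.endLine. */
    static int low(long offset) {
        return (int) offset;
    }

    /** Rebuilds the offset from the two halves. */
    static long unpack(int beginLine, int endLine) {
        // The mask keeps a negative low word from sign-extending into the high bits.
        return ((long) beginLine << 32) | (endLine & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        long offset = 5_000_000_000L; // does not fit in a single int
        long recovered = unpack(high(offset), low(offset));
        System.out.println(recovered == offset);
    }
}
```

The mask on the low word matters once the offset passes Integer.MAX_VALUE, because the int stored in endLine can then be negative.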
If you do need to preserve the line/column information, you can specify a base class for the token type, along with a token factory:
    options {
        TOKEN_EXTENDS = "my.AbstractToken";
        TOKEN_FACTORY = "my.TokenFactory";
    }
...
Define the base token class:
    package my;

    public abstract class AbstractToken {
        private long offset;

        protected AbstractToken() {
            // The offset hasn't been initialized.
            offset = -1;
        }

        public long getOffset() { return this.offset; }
        void setOffset(long offset) { this.offset = offset; }
    }
And define the token factory:
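    package my;

    // Assumes the generated Token class is also in package my,
    // since setOffset is package-private.
    public class TokenFactory {
        private static long offset = 0;

        public static Token newToken(int kind, String image) {
            Token token = new Token(kind, image);
            // Record the start offset, then advance past this token's image.
            token.setOffset(offset);
            offset += image.length();
            return token;
        }
    }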
You'll have to reset the offset manually for the next parse. I've glossed over some of the other details, but it's worth noting that any SKIP definitions should be converted to SPECIAL_TOKEN definitions, in order to advance the offset for otherwise ignored whitespace.
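To illustrate both points, here is a small stand-alone sketch of the counting logic. OffsetTracker, SimpleToken and reset() are hypothetical stand-ins for the factory and the generated Token class, not JavaCC APIs. It assumes whitespace is passed through the factory like any other token, which is what a SPECIAL_TOKEN definition would give you.

```java
// Sketch only: OffsetTracker and SimpleToken are hypothetical stand-ins for
// my.TokenFactory and the generated Token class; they are not JavaCC APIs.
final class OffsetTracker {
    /** Minimal stand-in for a token that carries its start offset. */
    static final class SimpleToken {
        final String image;
        final long offset;

        SimpleToken(String image, long offset) {
            this.image = image;
            this.offset = offset;
        }
    }

    private long offset = 0;

    /** Gives the token the current offset, then advances past its image. */
    SimpleToken newToken(String image) {
        SimpleToken token = new SimpleToken(image, offset);
        offset += image.length();
        return token;
    }

    /** Call before each new parse so offsets start again from zero. */
    void reset() {
        offset = 0;
    }

    public static void main(String[] args) {
        OffsetTracker tracker = new OffsetTracker();
        // The whitespace is created through the factory too, so "bar"
        // lands at offset 4 rather than 3.
        for (String image : new String[] {"foo", " ", "bar"}) {
            SimpleToken t = tracker.newToken(image);
            System.out.println(t.offset + " '" + t.image + "'");
        }
        tracker.reset();
        System.out.println(tracker.newToken("baz").offset);
    }
}
```

If the whitespace were skipped instead, it would never reach newToken, and every later offset would fall short by the skipped length.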