NetBeans lexer throws exception recognizing trailing whitespace

https://stackoverflow.com/questions/20860186

23-09-2022
Question

I have a simple grammar written in ANTLR4 that includes (among others) a whitespace rule:

WhiteSpace : [ \t\r\n]+ -> skip;

This is integrated into a NetBeans platform application using org.netbeans.spi.lexer.Lexer. When the input has trailing whitespace (before EOF), I get the following exception:

java.lang.IllegalStateException: Lexer ExpressionLexer@2cdea2eb
  returned null token but lexerInput.readLength()=1
  lexer-state: null
  tokenStartOffset=20, readOffset=21, lookaheadOffset=22
  Chars: "\n" - these characters need to be tokenized.
Fix the lexer to not return null token in this state.

How can I make this trailing whitespace not cause an error?

Edit: The same input works without error when I use only the ANTLR-generated lexer and parser. The error occurs only when the grammar is integrated with the NetBeans lexer (and possibly with other integrations).

Solution

Change the WhiteSpace rule to send the token to a hidden channel rather than skipping it altogether:

WhiteSpace : [ \t\r\n]+ -> channel(HIDDEN);

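With -> skip, ANTLR consumes the whitespace without producing any token, so the trailing characters are read but never tokenized, which is exactly the state the exception describes. With the hidden channel, the parser still won't see the whitespace, but the NetBeans lexer gets a valid token for every character of the input, including the trailing whitespace.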
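To illustrate why only trailing whitespace is affected, here is a toy model in plain Java. It is not NetBeans or ANTLR code: the class TrailingWhitespaceDemo and the method unclaimedCharsAtEof are hypothetical names, and the model assumes an adapter in which each emitted token claims every character read since the previous token, so skipped whitespace in the middle of the input gets absorbed by the next token while skipped whitespace at the end has no token left to absorb it.

```java
/**
 * Toy model only (hypothetical names, no NetBeans or ANTLR classes).
 * Assumption: each emitted token claims every character read since the
 * previous token, and nothing more is emitted once end of input is reached.
 */
class TrailingWhitespaceDemo {

    /**
     * Returns how many characters have been read but not attached to any
     * token when the end of the input is reached.
     *
     * @param skipWhitespace true models "-> skip", false models "-> channel(HIDDEN)"
     */
    public static int unclaimedCharsAtEof(String input, boolean skipWhitespace) {
        int tokenStart = 0; // offset where the next emitted token would begin
        int pos = 0;        // current read offset
        while (pos < input.length()) {
            boolean ws = Character.isWhitespace(input.charAt(pos));
            // Consume one run of whitespace or one run of non-whitespace.
            while (pos < input.length()
                    && Character.isWhitespace(input.charAt(pos)) == ws) {
                pos++;
            }
            if (ws && skipWhitespace) {
                continue; // "-> skip": characters are read, but no token is emitted
            }
            tokenStart = pos; // a token was emitted and claims everything read so far
        }
        return pos - tokenStart;
    }

    public static void main(String[] args) {
        String input = "a + b\n";
        System.out.println(unclaimedCharsAtEof(input, true));  // prints 1
        System.out.println(unclaimedCharsAtEof(input, false)); // prints 0
    }
}
```

The 1 in the skip case corresponds to readLength()=1 in the exception above: a single "\n" was read and no token covers it. In a real adapter, the hidden-channel WhiteSpace token would presumably also need a NetBeans token id to map to, since it is now returned like any other token.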

Licensed under: CC-BY-SA with attribution
