Question
I keep running into this problem. I want to parse {1}SB0$1:U
inside this input: S:G$mabit$0$0({1}SB0$1:U),H,0,0
I have these rules here:
/*
 * Type Chain Record
 */
type_chain_record
    :
        '{' number[10] '}' type_dcl_id (',' type_dcl_id)? ':' type_sign
    ;

type_dcl_id
    :
        'DA' EXPRESSION 'd'             // Array of n elements
    |   'DF'                            // Function
    |   'DG'                            // Generic pointer
    |   'DC'                            // Code pointer
    |   'DX'                            // External ram pointer
    |   'DD'                            // Internal ram pointer
    |   'DP'                            // Page pointer
    |   'DI'                            // Upper 128 byte pointer
    |   'SL'                            // long
    |   'SI'                            // int
    |   'SC'                            // char
    |   'SS'                            // short
    |   'SV'                            // void
    |   'SF'                            // float
    |   'ST' EXPRESSION                 // Structure of name <name>
    |   'SX'                            // sbit
    |   'SB' EXPRESSION '$' EXPRESSION  // Bit field of n bits
    ;

type_sign
    :
        'U'     // Unsigned
    |   'S'     // Signed
    ;

number[int numbase] returns[long val]
    :
        n = EXPRESSION
        {
            $val = Convert.ToInt64($n.text, $numbase);
        }
    ;
// ////////////////////////////////////////////////////////////////////////////
// LEXER RULES
fragment LETTER
    :
        'a'..'z'
    |   'A'..'Z'
    ;

fragment DIGIT
    :
        '0'..'9'
    ;

fragment NONZERO_DIGIT
    :
        '1'..'9'
    ;

FILE_SCOPE
    :
        'L' (LETTER)+ '.' (LETTER)+
    ;

EXPRESSION
    :
        (LETTER | DIGIT | '_' )+
    ;

WS
    :
        '\r' | '\n'
    ;
I don't understand why, but I am getting a NoViableAltException saying: line x:y no viable alternative at input 'SB0'.
Could anyone explain to me why this is happening? The parser rule type_dcl_id has a unique literal in front of every alternative, so I don't see why the parser would have trouble at this point.
I have included all of my lexer rules above.
Side note:
The reason I want this granularity, rather than simply parsing over the input, is that I want type_dcl_id to return an object later on. That object should be propagated up to type_chain_record and then used to construct another object, ChainRecord, which will hold a DCLType object.
Solution
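SB0 gets tokenized as a single EXPRESSION token, because the lexer always matches the longest possible sequence, and SB0 is longer than SB. The parser therefore never sees the 'SB' literal that the last alternative of type_dcl_id expects, which is why it reports no viable alternative.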
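To see the longest-match behaviour outside of ANTLR, here is a minimal Python sketch. It is not how ANTLR is implemented. TOKEN_SPECS and tokenize are hypothetical names, only a few of the grammar's literals are listed, and the tie-break (the earlier entry wins when two matches have the same length) is an assumption made for this illustration.

```python
import re

# Hypothetical stand-ins for a few tokens of the grammar in the question:
# some parser literals plus the EXPRESSION lexer rule.
# Assumption: on a tie in length, the entry listed first wins.
TOKEN_SPECS = [
    ("'SB'", re.compile(r"SB")),
    ("'U'", re.compile(r"U")),
    ("'$'", re.compile(r"\$")),
    ("':'", re.compile(r":")),
    ("'{'", re.compile(r"\{")),
    ("'}'", re.compile(r"\}")),
    ("EXPRESSION", re.compile(r"[A-Za-z0-9_]+")),
]

def tokenize(text):
    """Split text into (token_name, lexeme) pairs using longest match."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in TOKEN_SPECS:
            m = pattern.match(text, pos)
            # A strictly longer match replaces the current best,
            # so earlier entries keep winning ties.
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no token matches at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

for token in tokenize("{1}SB0$1:U"):
    print(token)
```

In this sketch the characters SB0 come out as one EXPRESSION token and no 'SB' token is produced, which mirrors the situation the parser complains about.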
An easy workaround would be to make LETTER and DIGIT real lexer rules instead of fragments, and to replace the EXPRESSION lexer rule with the following new parser rule:
expression : (LETTER | DIGIT | '_' )+ ;
For more information, you might find this post helpful: https://github.com/antlr/antlr4/issues/485#issuecomment-37284837
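As a rough illustration of why this helps, the following Python sketch tokenizes the bit-field part of the input with single-character LETTER and DIGIT tokens. WORKAROUND_SPECS and tokenize_workaround are hypothetical names, only the 'SB' and '$' literals are modelled, and the rule that the first-listed entry wins a tie in length is an assumption of the sketch, not a statement about ANTLR internals.

```python
import re

# Hypothetical token table after the workaround: LETTER and DIGIT are
# ordinary one-character tokens, and EXPRESSION no longer exists as a token.
WORKAROUND_SPECS = [
    ("'SB'", re.compile(r"SB")),
    ("'$'", re.compile(r"\$")),
    ("LETTER", re.compile(r"[A-Za-z]")),
    ("DIGIT", re.compile(r"[0-9]")),
    ("'_'", re.compile(r"_")),
]

def tokenize_workaround(text):
    """Longest-match tokenizer over WORKAROUND_SPECS."""
    tokens = []
    pos = 0
    while pos < len(text):
        candidates = []
        for name, pattern in WORKAROUND_SPECS:
            m = pattern.match(text, pos)
            if m:
                candidates.append((m.end() - pos, name))
        if not candidates:
            raise ValueError(f"no token matches at position {pos}")
        # max() returns the first maximal item, so earlier entries win ties.
        length, name = max(candidates, key=lambda c: c[0])
        tokens.append((name, text[pos:pos + length]))
        pos += length
    return tokens

print(tokenize_workaround("SB0$1"))
```

Because no single token can swallow SB0 any more, the two-character literal 'SB' is now the longest match at that position, and the digits that follow arrive as separate DIGIT tokens for the expression parser rule to collect.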
OTHER TIPS
The alternative
| 'SB' EXPRESSION '$' EXPRESSION // Bit field of n bits
does not match SB0.