Question

OK, so I'm experimenting with token aliases and facing some issues.

Let's take this part of my (ultra-simplified) Bison grammar as an example:

/****************************************
 Definitions
 ****************************************/

%union 
{
    char* str;
}

/****************************************
 Tokens & Types
 ****************************************/

%token <str> ID "identifier"
%token <str> NUMBER_DEC "number" 
%token <str> NUMBER_HEX "number" 
%token <str> NUMBER_BIN "number"
%token <str> NUMBER_FLOAT "number"

%type <str> identifier number
%type <str> assignment_st
%type <str> statements statement
%type <str> program

/****************************************
 Directives
 ****************************************/

%glr-parser
%locations
%start program
%define parse.error verbose
%%


/****************************************
 Grammar Rules
 ****************************************/

identifier          :   ID
                    ;

number              :   NUMBER_DEC
                    |   NUMBER_HEX
                    |   NUMBER_BIN
                    |   NUMBER_FLOAT
                    ;

assignment_st       :   identifier '=' number ';'                   { printf("assignment : %s = %s\n",$identifier,$number); }
                    ;

statement           :   assignment_st
                    ;

statements          :   statement
                    |   statements statement
                    ;

program             :   statements
                    ;

%%

Now, if I try a = 2; the grammar obviously accepts it. If I try a = b; it is an error, since a number is expected. In this case, the parser reports:

syntax error, unexpected identifier, expecting number or NUMBER_HEX or NUMBER_BIN or NUMBER_FLOAT

(Well, the "number" alias is a duplicate since it's used in 4 tokens).

However, I'd be looking for something more like unexpected identifier, expected number.

How would you go about it?

Also, is there any chance I could incorporate the error line as well in the message?


P.S. I have been looking into the latest Bison documentation for hours, but I feel as if I'll end up building a... rocket instead of fixing the error messages... lol


Solution

How would you go about it?

I'd use a single NUMBER token. I don't see any reason to make the parser care which type of numeric literal it's looking at.

Of course, it's possible that your full grammar does actually involve places where only certain formats of numeric literals are allowed, although my general inclination about that sort of thing is "yuk". The most likely possibility is that there is some rule in which an integer constant is ok, but a floating point constant is not. In that case, you're not going to get good error messages for that particular production unless you provide a different alias for floating point numbers than for integers. On the whole, though, I'm sticking with "yuk".

If you have a single numeric token type, with alias "number", then the errors should work out fine.
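
As a rough sketch of that suggestion, the token section and the number rule from the question could collapse as shown below. This assumes the lexer (not shown in the question) is changed to return the one NUMBER token for decimal, hex, binary and float literals alike; the name NUMBER is illustrative, and the rest of the grammar stays as it is.

```yacc
/* Sketch only: one token for every numeric literal format.
   NUMBER is an assumed name; the lexer must return it for all
   four literal kinds that previously had their own tokens. */
%token <str> ID     "identifier"
%token <str> NUMBER "number"

%type <str> identifier number

%%

identifier          :   ID
                    ;

number              :   NUMBER      /* replaces the four NUMBER_* alternatives */
                    ;
```

With only one token carrying the "number" alias, the verbose message for a = b; should come out along the lines of "syntax error, unexpected identifier, expecting number", since there is no longer a list of separate numeric tokens to enumerate.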

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow