previous token on yacc/lex

https://stackoverflow.com/questions/21555990

06-10-2022

Question

I am coding some SQL manipulation and I am using the language definition linked in this post: SQL lex yacc grammar

I am seeing that when I try to invoke a function on detecting the table name of an INSERT statement, yytext holds the following token instead of the name. My code calls the function like this:

insert_statement:
        INSERT INTO table opt_column_commalist values_or_query_spec
    ; 
[..]
table:
        NAME {setGlobalTablename(yytext);}
    |   NAME '.' NAME
    ;
[..]

When I run it with a valid query like this:

INSERT INTO users (uid,username,ocupation,age) VALUES (1,'john','tech','30');

The yytext value I get inside the called function is '('.

I understand from this that, to conclude that the token is a table, the parser must read the next token (in this case '('), but I don't know how to get the real table name (users). In fact, it is not the previous token but the one two tokens earlier, as the space is also considered a token.

I don't know if I am misunderstanding something, but I can't find a way of getting the name.

Solution

Don't read yytext in the parser. As you've discovered, it can be unpredictable which token it contains, because the parser may read one token ahead to decide whether to shift or reduce.

Instead, read yytext in the lexer (and only in the lexer) and make a copy of it if you need the value in the parser. You end up with a lexer rule like:

[a-zA-Z][a-zA-Z0-9]*    { yylval.str = strdup(yytext); return NAME; }

and in your parser:

%union {
    char *str;
      :
}
%token <str> NAME
     :
%%
     :
table:  NAME { setGlobalTableName($1); }
     |  NAME '.' NAME  ...
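
To make the lookahead effect concrete, here is a minimal plain-C sketch that simulates it without lex or yacc. Everything in it is a hypothetical stand-in and not part of any real lex/yacc API: token_buf plays the role of yytext, scan_token plays the role of yylex, copy_token plays the role of the strdup in the lexer rule, and simulate_table_rule mimics the parser reading '(' before it runs the action for table: NAME.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for yytext: a single buffer the scanner overwrites on every token. */
static char token_buf[64];

/* Hypothetical stand-in for yylex(): loads the next lexeme into token_buf. */
static void scan_token(const char *lexeme) {
    assert(strlen(lexeme) < sizeof token_buf);
    strcpy(token_buf, lexeme);
}

/* Stand-in for `yylval.str = strdup(yytext)`: returns an owned heap copy. */
static char *copy_token(void) {
    char *copy = malloc(strlen(token_buf) + 1);
    if (copy != NULL) {
        strcpy(copy, token_buf);
    }
    return copy;
}

/*
 * Simulates parsing `users (`. The NAME token is scanned and copied, then the
 * lookahead '(' is scanned BEFORE the action for `table: NAME` would run.
 * The two out-parameters report what that action would see.
 */
static void simulate_table_rule(const char **seen_in_buffer, char **seen_in_copy) {
    scan_token("users");
    char *value = copy_token();   /* saved at scan time, like $1 */
    scan_token("(");              /* lookahead overwrites the shared buffer */
    *seen_in_buffer = token_buf;  /* what reading yytext in the action gives */
    *seen_in_copy = value;        /* what $1 gives; the caller must free it */
}
```

In this sketch the shared buffer holds the lookahead by the time the action runs, while the copy still holds the table name. In a real grammar the copy arrives as $1, and because it is heap-allocated, whichever code receives it is responsible for freeing it.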

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow