Getting a syntax error when parsing an assignment statement in yacc

https://stackoverflow.com/questions/21800758

12-10-2022

Question

[Caution Homework]

I need a hint as to why the following code fails when I run the test case: int a = 3; All the code compiles with no warnings or errors, and as far as I can tell the structure is correct. I feel there must be a problem in the rules for assignment.

The error message says: ERROR: syntax error at symbol "a" on line 1

This is the .lex file:

%{

#include "calc.h"
#include "symbol-table.h"
#include "tok.h"
int yyerror(char *s);
int yylinenum = 1;
%}

digit        [0-9]
int_const    {digit}+
float_const  {digit}+[.]{digit}+
id           [a-zA-Z]+[a-zA-Z0-9]*

%%

{int_const}    { yylval.int_val = atoi(yytext); return INTEGER_LITERAL; }
{float_const}  { yylval.float_val = atof(yytext); return FLOAT_LITERAL; }
"="            { yylval.str_val = strdupclean(yylval.str_val, yytext); return EQUALS; }
"+"            { yylval.str_val = strdupclean(yylval.str_val, yytext); return PLUS; }
"*"            { yylval.str_val = strdupclean(yylval.str_val, yytext); return MULT; }
"-"            { yylval.str_val = strdupclean(yylval.str_val, yytext); return MINUS; }
"/"            { yylval.str_val = strdupclean(yylval.str_val, yytext); return DIV; }
"("            { yylval.str_val = strdupclean(yylval.str_val, yytext); return OPAREN; }
")"            { yylval.str_val = strdupclean(yylval.str_val, yytext); return CPAREN; }
";"            { yylval.str_val = strdupclean(yylval.str_val, yytext); return SEMIC; }
"sqrt"         { yylval.str_val = strdupclean(yylval.str_val, yytext); return SQRT; }
{id}           { yylval.str_val = strdupclean(yylval.str_val, yytext); 
                 /*HINT: One way to simplify parsing is to have lex return what
                  * type of variable we have.  IVAR = int, FVAR = float
                  * UVAR = unknown var.
                  * Naturally, you may use your own solution.
                  */
              if (strcmp(yylval.str_val, "int")) {return IVAR;}
              else if (strcmp(yylval.str_val, "float")) {return FVAR;}
                  else {return UVAR;} 
               }

[ \t]*         {}
[\n]           { yylinenum++;    }

.              { yyerror("Unknown Symbol"); exit(1); }
%%

and this is the yacc file:

%{
#include "calc.h"
#include "symbol-table.h"
int yyerror(char *s);
int yylex(void);
%}

%union{
int         int_val;
float       float_val;
char*       str_val;
}

%start input

%token <int_val>   INTEGER_LITERAL
%token <float_val> FLOAT_LITERAL
%token <float_val> SQRT
%token OPAREN CPAREN SEMIC IVAR FVAR UVAR
%type  <int_val>   int_exp
%type  <float_val> float_exp
%type  <str_val>  IVAR FVAR UVAR
%right EQUALS /*right associative, everything on the right side of the = should be evaluated and stored*/
%left  PLUS MINUS /*The order matters, by listing PLUS/MINUS first and then MULT/DIV we are */
%left  MULT DIV /*telling yacc to evaluate MULTs & DIVs before PLUSes and MINUSes*/

%%

input:           /*empty*/
            | int_exp                   { printf("Result %d\n", $1); }
            | float_exp                 { printf("Result %f\n", $1); }
            | assignment                { printf("Result \n"); }
            ;

int_exp:          INTEGER_LITERAL           { $$ = $1; }
            | int_exp PLUS int_exp      { $$ = $1 + $3; }
            | int_exp MULT int_exp      { $$ = $1 * $3; }
            | int_exp MINUS int_exp     { $$ = $1 - $3; }
            | int_exp DIV int_exp       { $$ = $1 / $3; }
            | OPAREN int_exp CPAREN     { $$ = $2; }
            ;

float_exp:        FLOAT_LITERAL             { $$ = $1; }
            | float_exp PLUS float_exp  { $$ = $1 + $3; }
            | float_exp MULT float_exp  { $$ = $1 * $3; }
            | float_exp MINUS float_exp { $$ = $1 - $3; }
            | float_exp DIV float_exp   { $$ = $1 / $3; }
            | int_exp PLUS float_exp    { $$ = (float)$1 + $3; }
            | int_exp MULT float_exp    { $$ = (float)$1 * $3; }
            | int_exp MINUS float_exp   { $$ = (float)$1 - $3; }
            | int_exp DIV float_exp     { $$ = (float)$1 / $3; }
            | float_exp PLUS int_exp    { $$ = (float)$1 + $3; }
            | float_exp MULT int_exp    { $$ = (float)$1 * $3; }
            | float_exp MINUS int_exp   { $$ = (float)$1 - $3; }
            | float_exp DIV int_exp     { $$ = (float)$1 / $3; }
            | OPAREN float_exp CPAREN   { $$ = $2; }
            | SQRT OPAREN float_exp CPAREN  { $$ = sqrt((double)$3); }
            | SQRT OPAREN int_exp CPAREN    { $$ = sqrt((double)$3); }
            ;

assignment:       UVAR EQUALS float_exp SEMIC           { //if UVAR exists and is float, update value
                                          //if UVAR doesn't exist, error: unknown type
                                          symbol_table_node *n1 = symbol_table_find( $1, *st);
                                          if(n1) { 
                                            if(n1->type == FLOAT_TYPE) {
                                                n1->val.float_val = $3;
                                            } else {
                                                //error
                                            }
                                            //error, variable not defined
                                          //if UVAR is not float, error: illegal assignment
                                          }
                                        }
            | UVAR EQUALS int_exp SEMIC         { 
                                          symbol_table_node *n1 = symbol_table_find( $1, *st);
                                          if(n1) { 
                                            if(n1->type == INT_TYPE) {
                                                n1->val.int_val = $3;
                                            } else {
                                                //error
                                            }
                                          }
                                        }
            | IVAR UVAR EQUALS int_exp SEMIC { //UVAR should not be in symbol table
                                          if(symbol_table_find($2, *st)) {
                                            //error
                                          } else {
                                            //how to handle errors?
                                            symbol_table_add_i($2, $4, *st);
                                          }
                                        }
            | FVAR UVAR EQUALS float_exp SEMIC { 
                                          if(symbol_table_find($2, *st)) {

                                          } else {
                                            symbol_table_add_f($2, $4, *st);
                                          }
                                        }
            ;

%%

int yyerror(char *s){
extern int yylinenum; /* defined and maintained in lex.c*/
extern char *yytext;  /* defined and maintained in lex.c*/

printf("ERROR: %s at symbol \"%s\" on line %d\n", s, yytext, yylinenum);
return -1;
}

Solution

The bug is simple enough:

if (strcmp(yylval.str_val, "int"))

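does not do what you think it does. strcmp returns 0 if the two strings are equal, a negative value if the first one is lexicographically earlier, and a positive value if the first one is lexicographically later. Used as a boolean, that means the result of strcmp is false if the strings compare equal, and true otherwise. So the input int will not generate the token IVAR: it fails the first test, passes the second one (it differs from "float"), and is returned as FVAR. Every other identifier, including a, differs from "int" and is returned as IVAR. The parser therefore sees FVAR IVAR EQUALS ..., which matches none of your assignment rules, and that is why the error is reported at the symbol "a".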
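To make the point about strcmp concrete, here is a minimal sketch in plain C, outside of lex. The token names and both function names are made up for illustration (the real codes would come from the yacc-generated header); the second function only mirrors the shape of the tests in the question so you can see which branch each word takes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical token codes, for illustration only. In the real project
 * they come from the yacc-generated header (tok.h in the question). */
enum word_token { TOK_INT_KEYWORD = 1, TOK_FLOAT_KEYWORD, TOK_IDENTIFIER };

/* Classify a word. strcmp returns 0 when the strings are equal,
 * so equality has to be tested with "== 0". */
enum word_token classify_word(const char *text)
{
    assert(text != NULL);
    if (strcmp(text, "int") == 0)
        return TOK_INT_KEYWORD;
    if (strcmp(text, "float") == 0)
        return TOK_FLOAT_KEYWORD;
    return TOK_IDENTIFIER;
}

/* The same tests without "== 0", shaped like the ones in the question. */
enum word_token classify_word_buggy(const char *text)
{
    assert(text != NULL);
    if (strcmp(text, "int"))            /* nonzero for every word except "int" */
        return TOK_INT_KEYWORD;
    else if (strcmp(text, "float"))     /* only "int" gets here, and it differs from "float" */
        return TOK_FLOAT_KEYWORD;
    else
        return TOK_IDENTIFIER;          /* never reached */
}
```

With the buggy version, "int" is classified as the float keyword and "a" as the int keyword, which is the token sequence that produces the syntax error at "a".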

But I don't think that's really what you wanted to do, anyway. Your instructor's hint about having the lexer return a token type corresponding to the known datatype of the token is referring to looking the variable up in the symbol table and returning a token type corresponding to the declaration of the variable. It's not referring to recognizing the reserved words int and float, which should be done with simple lexer rules, much like your rule for the reserved word sqrt.

As written, your grammar does not allow expressions to use variables, so (even after you fix the bug I referred to), the following will fail:

int b = 0;
int a = b + 3;

because b won't be recognized as an int_exp. It's in this context that the instructor's hint applies. (Although personally, I would suggest doing it differently.)
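As a rough sketch of what the hint is probably getting at: the lexer looks each identifier up in the symbol table and returns a token that reflects how the variable was declared. The table layout, the token names and both functions below are stand-ins invented for this example; they are not the symbol-table.h API from the question, whose details I can't see.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in types for illustration; the real ones live in symbol-table.h. */
enum var_type { TYPE_INT, TYPE_FLOAT };
enum var_token { TOK_IVAR = 1, TOK_FVAR, TOK_UVAR };

struct symbol {
    const char   *name;
    enum var_type type;
};

/* Linear search; returns NULL when the name has not been declared. */
const struct symbol *find_symbol(const struct symbol *table, size_t count,
                                 const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    }
    return NULL;
}

/* What the {id} action would decide: known int variable, known float
 * variable, or a name that is not in the table yet. */
enum var_token classify_identifier(const struct symbol *table, size_t count,
                                   const char *name)
{
    const struct symbol *sym = find_symbol(table, count, name);
    if (sym == NULL)
        return TOK_UVAR;
    return sym->type == TYPE_INT ? TOK_IVAR : TOK_FVAR;
}
```

With tokens assigned this way, the grammar could accept a declared variable inside an expression, for example through an extra alternative such as int_exp: IVAR, while the reserved words int and float would need their own tokens and their own lexer rules.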

Finally, I don't know what strdupclean does, but I presume it involves making a copy of yytext. That's almost certainly unnecessary in the case of operator tokens (+, -, and so on) or reserved words, since you will never refer to the "semantic value" of those tokens. (As evidence, you don't declare any of these tokens to even have a semantic type.) Unnecessary copying does have a cost, particularly if you need to clean up the memory allocated for the copy.

Good luck.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow