Question

I am new to Lex and Yacc and I am trying to create a parser for a simple language which allows for basic arithmetic and equality expressions. Though I have some of it working, I am encountering errors when trying to parse expressions involving binary operations. Here is my .y file:

%{
   #include <stdlib.h>
   #include <stdio.h>
%}

%token  NUMBER
%token  HOME
%token  PU
%token  PD
%token  FD
%token  BK
%token  RT
%token  LT

%left '+' '-'
%left '=' '<' '>'
%nonassoc UMINUS


%%

S       :       statement S                 { printf("S -> stmt S\n"); }
        |                                   { printf("S -> \n"); }
;

statement :     HOME                        { printf("stmt -> HOME\n"); }
        |       PD                          { printf("stmt -> PD\n"); }
        |       PU                          { printf("stmt -> PU\n"); }
        |       FD expression               { printf("stmt -> FD expr\n"); }
        |       BK expression               { printf("stmt -> BK expr\n"); }
        |       RT expression               { printf("stmt -> RT expr\n"); }
        |       LT expression               { printf("stmt -> LT expr\n"); }
;

expression :    expression '+' expression   { printf("expr -> expr + expr\n"); }
         |      expression '-' expression   { printf("expr -> expr - expr\n"); }
         |      expression '>' expression   { printf("expr -> expr > expr\n"); }
         |      expression '<' expression   { printf("expr -> expr < expr\n"); }
         |      expression '=' expression   { printf("expr -> expr = expr\n"); }
         |      '(' expression ')'          { printf("expr -> (expr)\n"); }
         |      '-' expression %prec UMINUS { printf("expr -> -expr\n"); }
         |      NUMBER                      { printf("expr -> number\n"); }
;

%%

int yyerror(char *s)
{
   fprintf (stderr, "%s\n", s);
   return 0;
}

int main()
{
   yyparse();
}

And here is my .l file for Lex:

%{
   #include "testYacc.h"
%}

number [0-9]+

%%
[ ]             { /* skip blanks */ }
{number}        { sscanf(yytext, "%d", &yylval); return NUMBER; }
home            { return HOME; }
pu              { return PU; }
pd              { return PD; }
fd              { return FD; }
bk              { return BK; }
rt              { return RT; }
lt              { return LT; }

%%

When I enter an arithmetic expression on the command line, I get the following output, which ends in a syntax error:

home
stmt -> HOME

pu
stmt -> PU

fd 10
expr -> number

fd 10
stmt -> FD expr
expr -> number

fd (10 + 10)
stmt -> FD expr
(expr -> number
+stmt -> FD expr
S ->
S -> stmt S
S -> stmt S
S -> stmt S
S -> stmt S
S -> stmt S
syntax error

Solution

Your lexer has no rules that match and return single-character tokens such as '+', '-', '(' and ')', so when one appears in the input, lex's default action echoes it to stdout and discards it. That is what happens when you enter fd (10 + 10): the lexer returns only the tokens FD NUMBER NUMBER, while (, + and ) are echoed to stdout. The parser then sees a second NUMBER directly after the first, which your grammar does not allow, and reports a syntax error.
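To make the echoing concrete: lex copies any input that no rule matches straight to its output. A minimal sketch of that default, written out as an explicit rule, is below. It is only an illustration and not something to add to your lexer; the file name echo.l and the small driver at the end are assumptions made so the sketch builds on its own.

```lex
%%
.|\n            { ECHO; /* explicit form of lex's default action: copy unmatched text to the output */ }
%%
/* Hypothetical driver so the sketch is self-contained: lex echo.l && cc lex.yy.c */
int yywrap(void) { return 1; }
int main(void)   { yylex(); return 0; }
```

With no other rules, this scanner behaves like a plain copy of stdin to stdout. In your lexer the same thing happens only to the characters your rules do not cover, which is where the stray ( and + in your transcript come from.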

You need a rule that returns these single-character tokens. The easiest way is to add one catch-all rule at the end of your .l file:

.               { return *yytext; }

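which matches any single character that no earlier rule has matched and returns its character code as the token. That is exactly the value yacc assigns to quoted literals such as '+' and '(' in your grammar.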

Note that this does NOT match a \n (newline), so newlines in your input will still be echoed and otherwise ignored. You might want to add newlines (along with tabs and carriage returns) to your skip-blanks rule:

[ \t\r\n]       { /* skip blanks */ }
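Putting both changes together, the .l file might look like the sketch below. It is your original lexer with only the whitespace rule widened and the catch-all rule added at the end. It assumes, as in your question, that testYacc.h is the header yacc generates and that yylval is a plain int.

```lex
%{
   #include "testYacc.h"   /* token codes and yylval, generated by yacc (as in the question) */
%}

number [0-9]+

%%
[ \t\r\n]       { /* skip blanks, tabs, carriage returns and newlines */ }
{number}        { sscanf(yytext, "%d", &yylval); return NUMBER; }
home            { return HOME; }
pu              { return PU; }
pd              { return PD; }
fd              { return FD; }
bk              { return BK; }
rt              { return RT; }
lt              { return LT; }
.               { return *yytext; /* any other single character becomes its own token */ }

%%
```

The catch-all rule has to come last: lex prefers the longest match and, on a tie, the rule listed first, so a one-character rule placed earlier would shadow the blank-skipping rule. A character your grammar does not use, such as '*', now reaches the parser and produces a syntax error there instead of being silently echoed.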
Licensed under: CC-BY-SA with attribution