Question

I've been working on a calculator in Flex and Bison. I can assign variables individually such as

m = 5
m * 5
25

but now I want to be able to assign multiple variables such as m = n = 5

Here's what I have for my bison file right now. How would I modify it to the above?

%{
 #include <stdio.h>
 void yyerror(char *);
 int yylex(void);
 int sym[26];
%}
%token INTEGER VARIABLE
%left '+' '-' '*'
%%
program:
 program statement '\n'
 | /* NULL */
 ;
statement:
 expression { printf("%d\n", $1); }
 | VARIABLE '=' expression { sym[$1] = $3; }
 ;
expression:
 INTEGER
 | VARIABLE { $$ = sym[$1]; }
 | '-' expression { $$ = -$2; }
 | expression '+' expression { $$ = $1 + $3; }
 | expression '-' expression { $$ = $1 - $3; }
 | expression '*' expression { $$ = $1 * $3; }
 | '(' expression ')' { $$ = $2; }
 ;
%%
void yyerror(char *s) {
 fprintf(stderr, "%s\n", s);
}
int main(void) {
 yyparse();
}
Was it helpful?

Solution

Three quick observations:

1) You're setting +, - and * to have the same precedence, which means that

 1 + 2 * 3

will be parsed as though it were

 (1 + 2) * 3   (9)

whereas the normal interpretation would be

 1 + (2 * 3)   (7)

2) Really, assignment can be considered just another case of an expression, where the = operator has lower precedence than any other operator, and associates to the right.

3) Finally, you should set a precedence for unary minus. See the bison manual for more explanation.

Making these changes results in the following simple grammar:

%token INTEGER VARIABLE
%right '='
%left '+' '-'
%left '*'
%right UNOP
%%

program:
 program statement '\n'
 | /* NULL */
 ;
statement:
 expression { printf("%d\n", $1); }
 | /* EMPTY; added to allow empty lines */
 ;
expression:
 INTEGER
 | VARIABLE                  { $$ = sym[$1]; }
 | VARIABLE '=' expression   { $$ = sym[$1] = $3; }  /* CHANGED; return result */
 | '-' expression %prec UNOP { $$ = -$2; }           /* CHANGED; add precedence */
 | expression '+' expression { $$ = $1 + $3; }
 | expression '-' expression { $$ = $1 - $3; }
 | expression '*' expression { $$ = $1 * $3; }
 | '(' expression ')'        { $$ = $2; }
 ;

Note that expression is used in both left- and right-recursive contexts in the above rules. That's normal. Notwithstanding the rather alarmist statement in the bison manual about right-recursion, it's actually totally reasonable to write right-recursive rules when that's what the semantics of the language indicates. (If there is no semantic difference, prefer left-recursion to avoid using the parser stack, but using the parser stack is rarely an issue.)

As Chris Dodd points out in a comment, the above simplified grammar alters the semantics of the program because it will print out the value of an assignment expression; in the original, assignment statement were mute. This is easy to fix, but slightly tedious. I didn't do it in the above answer because it distracts from the structure of the solution.

The simplest fix is to duplicate the assignment production, both as part of statement and of expression; this results in a reduce/reduce conflict which bison will warn you about, but since bison prefers the reduction earlier in the file, it will produce the correct result. Many people (including me) would consider relying on that rule in production code to be ugly and hard to maintain.

Another option is to separate expressions into "quiet" and "loud" expressions, which have different unit productions for statement. That would look somewhat more like the original grammar, with precedences fixed.

Yet another option would be to build an Abstract Syntax Tree (AST) rather than immediately executing the program. When the AST is executed, you can easily suppress printing the result if the top node is an assignment. (Unlike the quiet/loud solution, a naive implementation of the AST would suppress printing for both a simple assignment, a = 3, and an obvious expression assignment (a = 3), but you can work around that, too.)

OTHER TIPS

This is very easy to do with right-recursion, which unfortunately isn't very efficient in bison:

statement:
  expression {}
  | assignment {}
  ;
assigment:
  VARIABLE '=' assignment {}
  | VARIABLE '=' expression {}
  ;

Alternatively, a left-recursion solution:

statement:
  expression {}
  | assignment {}
  ;
assigment:
  recursive_assignment '=' expression {}
  ;
recursive_assignment:
  recursive_assignment '=' VARIABLE {}
  | VARIABLE  {}
  ;

Here is some useful information on recursion in bison.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top