Question

I've been trying to parse concatenated strings such that expressions can also be concatenated to form the string. That is,

"No, " + 4*(6+5)/(8-4) + " is not equal to " + 75*1.3 + "."

The above should be parsed correctly. The problem is that + leads to shift-reduce conflicts. I've been using the following grammar;

<S> ::= <A> '+' <S>
      | <A>
<A> ::= <E>
       |QUOT
<E> ::= <T> '+' <E>
      | <T> '-' <E>
      | <T>
<T> ::= <F> '*' <T>
      | <F> '/' <T>
      | <F>
<F> ::= NUM
      | '(' <E> ')'

I've haven't had any success in trying to find a grammar where + does not cause a shift-reduce conflict. I'm hoping that there is a way to make this grammar LALR, and I'll really appreciate some help in trying to find it.

Was it helpful?

Solution

If:

  1. You don't care about how either + operator (addition or concatenation) associates; and
  2. You don't need to be able to use parentheses around string expressions,

then there is a solution. I regard it more as an intellectual exercise than a practical solution, partly because the above requirements are highly restrictive, and partly because I don't think you should use the same operator with two different precedences. (In fact, I'm not a big supporter of using + for concatenation. It's different enough that it deserves its own symbol.)

The following solution works pretty hard to make sure that string expressions are distinguishable from arithmetic expressions, which requires that string expressions cannot start with a (. In order to avoid premature reduction of concatenation, it makes the concatenation version of + associate to the right, while the addition version associates to the left as normal.

S : E '+' A
  | E
  | A
  ;

A : QUOT '+' S
  | QUOT
  ;

E : E '+' T
  | E '-' T
  | T
  ;

T : T '*' F
  | T '/' F
  | F
  ;

F : NUM
  | '(' E ')'
  ;
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top