Question

if I have a grammar having a production that contains both left recursion and left factoring like

$\qquad \displaystyle F \to FBa \mid cDS \mid c$

which one has priority, left recursion or left factoring?

Was it helpful?

Solution

Transformations such as left factoring or removing left recursion do not have precedence rules. Obviously, the resulting grammars may be different but they will recognize the same language.

The question's example grammar is more difficult than the typical undergrad homework problem. So showing our work will be useful.

Left Recursion

Let's define a transformation that removes left recursion.

Given

$\qquad \displaystyle A \to A\alpha_0 \mid A \alpha_1 \mid \dots \mid A \alpha_n \mid B \beta_0 \mid B \beta_1 \mid \dots \mid B\beta_n$,

we remove left recursion like this:

$\qquad \displaystyle \begin{align} A_h &\to B \beta_0 \mid B \beta_1 \mid \dots \mid B \beta_n \\ A_t &\to \alpha_0 \mid \alpha_1 \mid \dots \mid \alpha_n \\ A_{t^+} &\to A_t A_{t^+} \mid A_t \\ A &\to A_h A_{t^+} \mid A_h \end{align}$

The generality of above is typically not given in compiler texts, but parsing texts like Grune & Jacobs do cover this. Left factoring can be applied to the above transformed grammar but will be just introduce extra rules that will not change the answer. So we will simplify the presentation without extra left factoring being performed.

In this answer we will not cover indirect left recursion issues because we are only concerned with a single non-terminal's rules. Note that indirect left recursion can be dealt with, though. (Open a separate question if that is important.)

Left Factoring

Removing left factoring is in most introductory compiler texts done like this. Given

$\qquad \displaystyle A \to x y \mid x z$

left factoring yields:

$\qquad \displaystyle \begin{align} A_s &\to y \mid z \\ A &\to x A_s \end{align}$

Now that's perform the transformations in both ordering.

Left factoring first

Let's left factor the question's grammar

$\qquad \displaystyle \begin{align} F_s &\to D S \mid \varepsilon \\ F &\to F B a \mid c F_s \end{align}$

and then remove the left recursion:

$\qquad \displaystyle \begin{align} F_h &\to c F_s \\ F_t &\to B a \\ F_{t^+} &\to F_t F_{t^+} | F_t \\ F_s &\to D S \mid \varepsilon \\ F &\to F_h F_{t^+} \mid F_h \end{align}$

Removing left-recursion first

And for the other ordering, let's remove left recursion from the question's grammar

$\qquad \displaystyle \begin{align} F_h &\to c D S \mid c \\ F_t &\to B a \\ F_{t^+} &\to F_t F_{t^+} \mid F_t \\ F &\to F_h F_{t^+} \mid F_h \end{align}$

and then left factor the terminal $c$:

$\qquad \displaystyle \begin{align} F_{hs} &\to D S \mid \varepsilon \\ F_h &\to c F_{hs} \\ F_t &\to B a \\ F_{t^+} &\to F_t F_{t^+} \mid F_t \\ F &\to F_h F_{t^+} \mid F_h \end{align}$

Aha, the resulting grammars are the same!

In general, proving that two grammars are equivalent is undecidable. So if a series grammar transformations could influence the language that is recognized then, it would be disastrous.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top