Question

I would like to ask for intuition to understand the difference between a CFG generating $\Sigma^*$ and a regular grammar generating $\Sigma^*$. I got the examples here from Sipser. Let $ALL_{CFG}$ refer to the language of CFGs that generate $\Sigma^*$, and let $ALL_{REX}$ refer to the language of regular expressions that generate $\Sigma^*$ (and since for each regular expression there is an equivalent regular grammar, we can also say that the equivalent regular grammar generates $\Sigma^*$).

From what I understand, we have:

  • $ALL_{CFG}$ is not decidable; it is not even Turing-recognizable. Let $\overline{A_{TM}}$ refer to the language that a TM $M$ does not accept an input word $w$. We can reduce $\overline{A_{TM}}$ to $ALL_{CFG}$ in polynomial time using computation histories. The reduction constructs a CFG which generates all possible words where: 1) the first characters do not match $w$, 2) the last characters do not match accepting configurations, and 3) the characters do not match valid transitions of $M$'s configurations. Thus, $M$ does not accept $w$ iff the CFG generates $\Sigma^*$ (i.e. there are no accepting computation histories). Since the reduction maps $\overline{A_{TM}}$ to $ALL_{CFG}$, and $\overline{A_{TM}}$ is not Turing-recognizable, $ALL_{CFG}$ is not Turing-recognizable.

  • $ALL_{REX}$ is decidable, since it is decidable whether a finite automaton accepts $\Sigma^*$ (a minimal sketch of this check follows after this list). However, the acceptance problem for any regular language $R$ can be polynomially reduced to $ALL_{REX}$: let $R_M$ be a TM that decides $R$, and let $f(R_M)$ be the result of a computation-history construction similar to the one outlined above. In more detail, $f(R_M)$ is the regular grammar that generates all possible words where 1) the first characters do not match $w$, 2) the last characters do not match rejecting configurations, and 3) the characters do not match valid transitions of $R_M$'s configurations. The decider then checks whether $\Sigma^* - L(f(R_M))$ is empty (which means that $f(R_M)$ generates $\Sigma^*$). If it is empty, then there are no rejecting computation histories and $w \in R$. (Sipser uses something like this to show EXPSPACE-completeness for regular expressions with exponentiation.)
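Here is the sketch I mentioned for the first bullet (a minimal illustration, assuming the DFA is given as a transition dictionary; the function name and encoding are my own, not from Sipser):

    from collections import deque

    def dfa_is_universal(start, accepting, delta, alphabet):
        """Decide whether a DFA accepts Sigma*.

        A DFA accepts every string iff every state reachable from the
        start state is accepting: a reachable non-accepting state is
        reached by some input string, which is then rejected.
        `delta` maps (state, symbol) -> state; `accepting` is a set of states.
        """
        seen = {start}
        queue = deque([start])
        while queue:
            q = queue.popleft()
            if q not in accepting:
                return False  # some string drives the DFA to q and is rejected
            for symbol in alphabet:
                r = delta[(q, symbol)]
                if r not in seen:
                    seen.add(r)
                    queue.append(r)
        return True

    # A two-state DFA over {a, b} whose states are all accepting:
    delta = {(0, 'a'): 1, (0, 'b'): 1, (1, 'a'): 1, (1, 'b'): 1}
    print(dfa_is_universal(0, {0, 1}, delta, 'ab'))  # True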

I would like to ask:

From the above, both regular grammars and CFGs can generate computation histories of a TM, and both can generate $\Sigma^*$. But what is it about the fundamental power of a CFG's rules that makes it valid to reduce $\overline{A_{TM}}$ to $ALL_{CFG}$, while it is not possible to reduce $\overline{A_{TM}}$ to $ALL_{REX}$ via some $f(A_{TM})$? I know that we cannot make such a reduction, since $ALL_{REX}$ is decidable while $\overline{A_{TM}}$ is not even Turing-recognizable... But I would like to understand this in terms of the difference in generating power between CFGs and regular grammars via their rules.

For instance, from what I read, CFGs permit rules $A \rightarrow BC$, where $B$ and $C$ are strings of variables and terminals. On the other hand, regular grammars only permit rules of the form $A \rightarrow aB$ or $A \rightarrow a$, where $a$ is a terminal. I would like to ask: why does incorporating rules of the form $A \rightarrow BC$ into a grammar give it enough generating power that it becomes valid to reduce $\overline{A_{TM}}$ to the grammar (i.e. to $ALL_{CFG}$)?


Solution

Your summary of the proof of undecidability is not accurate. In particular, your specification of $\overline{A_{TM}}$ is not correct: it should be the set of pairs $\langle M, w \rangle$ such that the TM $M$ does not accept the word $w$.

For a reasonable exposition of the proof, see https://liacs.leidenuniv.nl/~hoogeboomhj/second/codingcomputations.pdf particularly the beginning of Section 1 and Section 3.

The intuition is not easy to convey, as the proof is not entirely trivial. But here is the core fact. Let $v,w$ be two configurations of a Turing machine. Write $n(v)$ for the configuration reached after a single step of computation starting from configuration $v$. Define the language

$$L = \{v \# w^R \mid n(v) \ne w\}.$$

Then the key fact is that $L$ is context-free. This takes some work to prove, and proving it is a key step in the overall proof. However, that's the answer to your question: $L$ is context-free but not regular. As a result, we can reduce the halting problem to $ALL_{CFG}$ but not to $ALL_{REX}$.
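To make the flavor of this concrete, here is a grammar sketch for the closely related language $\{u \# v \mid u, v \in \{a,b\}^*,\ v \ne u^R\}$ (my own simplified example, not the construction from the linked notes, which compares $n(v)$ against $w$ instead). The variable $M$ covers pairs of equal length with a mismatched position, and $E$ covers pairs of different lengths:

$$
\begin{aligned}
S &\to M \mid E\\
M &\to aMa \mid aMb \mid bMa \mid bMb \mid aTb \mid bTa\\
T &\to aTa \mid aTb \mid bTa \mid bTb \mid \#\\
E &\to aEa \mid aEb \mid bEa \mid bEb \mid G \mid \#I\\
G &\to aG \mid bG \mid a\# \mid b\#\\
I &\to aI \mid bI \mid a \mid b
\end{aligned}
$$

The step $M \to aTb$ (or $bTa$) plants two different letters at mirrored positions, while the recursive rules around it merely keep the two sides of $\#$ equally long. A finite automaton cannot do this length bookkeeping, and that is exactly the non-regularity at play in $L$.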

I have skipped many steps to give you an overview of the main idea. You will need to read the full proof to fill in the details. I suggest that you next read and understand the proof, with this perspective in mind, and then revisit what I've written here. Hopefully that will help you understand why the proof holds for context-free languages but would fail for regular languages.

OTHER TIPS

The difference between the models stems, intuitively, from the ability of CFGs to count. More precisely, CFGs are able to generate strings of the form $a^nb^n$, where the number of $a$'s and $b$'s is the same.
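The canonical witness is the two-rule grammar below, which generates exactly $\{a^n b^n \mid n \ge 0\}$:

$$S \to aSb \mid \varepsilon$$

No finite automaton can recognize this language, since it would have to distinguish unboundedly many counts of $a$'s with finitely many states.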

This ability grants them the power to compare strings, which can then be utilized to show undecidability, because a CFG is able to compare the contents of the tape between two consecutive configurations.

This becomes a bit more evident if you recall that the halting problem for two-counter machines (Minsky machines) is undecidable. There, a configuration is given by the values of the two counters. You can think of this as a TM with a sort of unary alphabet (although not exactly). In two-counter machines, comparing two consecutive configurations amounts exactly to comparing the values of the counters in consecutive steps, which is right up the alley of CFGs.

In contrast, regular languages are captured by finite-state automata, which have finite memory and are able to count only up to a fixed number. Thus, these automata can simulate a TM as long as the length of a configuration is bounded in advance. Why does this give us PSPACE-hardness? Well, you can simulate any TM that works in bounded space; the bound does not have to be polynomial in the input. However, in order for the reduction itself to be polynomial, you need the bound to be polynomial. Thus, you get exactly PSPACE-hardness.
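To put a number on "bounded in advance" (a standard counting bound, my addition rather than part of the original answer): a machine with state set $Q$ and tape alphabet $\Gamma$ restricted to $s$ tape cells has at most

$$|Q| \cdot s \cdot |\Gamma|^s$$

distinct configurations, so an automaton that remembered an entire configuration in its finite control would be exponential in $s$. The computation-history encoding avoids this: corresponding cells of consecutive configurations sit a fixed distance of roughly $s$ apart in the history string, so the automaton only needs a counter up to $s$ plus a constant-size window, keeping its size (and hence the reduction) polynomial in $s$.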

With regards to the "type" of rules, it's not so much the $A\to BC$ rules that add the power; it's more the rules of the form $A\to aAb$ (or more generally, the ability to have cyclic rules). The reason is that $A\to BC$ has a "tree" structure, and if $B$ and $C$ are not related to each other later, then this is effectively a concatenation operation, which regular languages can simulate.

However, a rule of the form $A\to aAb$ maintains the "context" $A$, which is something regular languages cannot do.
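Iterating such a rule makes the mechanism visible (using the $a^nb^n$ grammar from above):

$$S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow \cdots \Rightarrow a^nSb^n \Rightarrow a^nb^n$$

Each application adds one $a$ and one $b$ in lockstep, so the two counts can never drift apart, whereas a right-linear rule like $A \to aB$ can only extend the string at one end.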

To summarize:

  • Semantically, the power of CFGs lies in their ability to count.

  • Syntactically, the power of CFGs lies in cyclic rules.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange