$\mathcal{G} = \{ v_2 v_4 \ldots v_{k} : v_1 v_2 v_3 v_4 \ldots v_{k-1} v_{k} \in \mathcal{L}, \text{ $k$ even} \} $ is context free language

https://cs.stackexchange.com/questions/126068

29-09-2020
|

Question

Let $\mathcal{L}$ be context free language over alphabet $\Sigma$. Define $\mathcal{G}$ as $$\mathcal{G} = \{ v_2 v_4 \ldots v_{k} : v_1 v_2 v_3 v_4 \ldots v_{k-1} v_{k} \in \mathcal{L}, \text{ $k$ even} \} $$

I have seen a similar question (asked 5 years ago) but I am not sure how it can work.

Proposition

$\mathcal{L}$ is CFL so it has own push-down automata. So let copy states of $\mathcal{L}$ and if it has a state called $S$ and it gets to state $T$ upon letter $x$ then $\mathcal{G}$ will have states $S_1, S_2, T_1, T_2$ and letter $x$ turns $S_1$ to $T_2$ and $S_2$ to $T_1$.

My question is why it is correct apporach? $\mathcal{G}$ automata won't read any of $v_1, v_3, v_5,... v_{k-1}$ so how it can ensure that this word belongs to $\mathcal{L}$?

Solution

Let the PDA for the given language $L$ be $P$. Take two copies of states of $P$: $P_1$ and $P_2$. We will join $P_1$ and $P_2$ as follows: if there is transition state $S$ from $T$ on reading $x$ pushing/popping $Y$, then add an $\epsilon$-transition from $S_1$ to $T_2$ pushing/popping $Y$, and add a transition from $S_2$ to $T_1$ on reading $x$ pushing/popping $Y$. The initial states will be in $P_1$ and the final states in $P_2$.

The idea is: We start from an initial state in $P_1$. We have to non-deterministically take an $\epsilon$-transition (because only those are present when we are at a state in $P_1$): this will correspond to reading $v_1$. Then, when we are in a state in $P_2$, we will read $v_2$ and move to an appropriate state in $P_1$ allowed by the transitions. This will make sure that we guess a letter $v_i$ before reading any letter $v_{i+1}$ from the input for all $i$ from $1$ to $k$.
This will make sure that $v_1v_2\ldots$ is in $L$. Hence, even though we are only reading letters at even positions, we are guessing the letters at odd positions such that the whole word should have been accepted by $P$.

You can try formally proving this.

OTHER TIPS

This answer assumes that $v_i \in \Sigma$ are individual symbols.

You can prove this using closure properties. The advantage is that any class of languages closed under the requisite closure properties will be closed under this operation. Specifically, we will need closure under homomorphism, reverse homomorphism, and intersection with regular language, which are exactly the so-called "full trio".

Let $\Sigma' = \{ \sigma' : \sigma \in \Sigma \}$ be a copy of $\Sigma$. Define homomorphisms $r,d\colon \Sigma \cup \Sigma' \to \Sigma$ by $r(\sigma) = r(\sigma') = \sigma$ and $d(\sigma) = \sigma$, $d(\sigma') = \epsilon$. Then $$ \mathcal{G} = d(r^{-1}(\mathcal{L}) \cap (\Sigma' \Sigma)^*). $$

Some families of languages, for example context-sensitive languages, are closed under the so-called "trio", in which homomorphism is replaced with $\epsilon$-free homomorphism (meaning $h(\sigma) \neq \epsilon$ for all letters $\sigma$). We can accommodate these as well with a slightly more complicated argument.

Let $e\colon \Sigma' \times \Sigma \to \Sigma \cup \Sigma$ be given by $e(\sigma',\sigma) = \sigma' \sigma$, and let $p\colon \Sigma' \times \Sigma \to \Sigma$ be given by $p(\sigma',\sigma) = \sigma$. Then $$ \mathcal{G} = p(e^{-1}(r^{-1}(\mathcal{L}) \cap (\Sigma'\Sigma)^*)). $$

The other answers use pushdown automata and closure properties. Lets try a solution with context-free grammars.

We may assume that $L$ has a context-free grammar in Chomsky normalform. Which means its productions are of the form $A\to BC$ and $A\to a$, with $A,B,C$ nonterminals, and $a$ terminal (in $\Sigma$).

We will build a grammar for the derived language $G$, where every other symbol is skipped. The new grammar will have nonterminals that keep track of the even/odd position of the next terminal in the string.

For each nonterminal $X$ we introduce four nonterminals $[i,X,j]$ with $i,j$ either $0$ or $1$.

For each production $A\to BC$ we introduce eight productions $[i,A,k] \to [i,B,j] [j,C,k]$

For earch terminal production $A\to a$ we introduce two productions $[0,A,1] \to \varepsilon$ and $[1,A,0] \to a$. (These productions toggle the parity of the symbol.)

The axiom for the new grammar is $[0,S,0]$ where $S$ is the original axiom.

This construction actually works that same way as one that can be used to prove that context-free languages are closed under intersection with regular languages. Usually that is shown using pushdown automata, but it can be done with context-free grammars.

Licensed under: CC-BY-SA with attribution

Not affiliated with cs.stackexchange