Question

Text books everywhere assume that the Bounded Post Correspondence Problem is NP-complete (no more than $N$ indexes allowed with repetitions). However, nowhere is one shown a simple (as in, something that an undergrad can understand) polynomial time reduction from another NP-complete problem.

However every reduction I can think of is exponential (by $N$ or by the size of the series) in run-time. Perhaps it can be shown that it is reducible to SAT?

Was it helpful?

Solution

As is often the case with NP-reductions, it makes sense to look for similar problems. In particular, it is hard to encode global conditions such has "have seen some nodes" into PCP (with polynomially many tiles) which contraindicates graph problems, packing problems would require us to encode unary numbers in PCP (creating exponentially large instance), and so on. Therefore, a string problem with only local restrictions can be expected to work best.

Consider the decision version of the shortest common supersequence problem:

Given two strings $a,b \in \Sigma^+$ with $|a|=n$ and $|b|=m$ and $k \in \mathbb{N}$, decide whether there is a string $c \in \Sigma^+$ with $|c| \leq k$ such that $a$ and $b$ are subsequences of $c$.

The idea is to let PCP build supersequences of $a$ and $b$ from left to right, encoding in the tiles' overlaps at which position we are in $a$ and $b$, respectively. It will use one tile per symbol in $c$, so $k$ corresponds to the BPCP's bound: if we can solve this PCP with $\leq k$ tiles, you can read off the common supersequence of equal length, and vice versa.

The construction of the tiles is a bit tedious, but quite clear. Note that we will not create tiles that do not forward $a$ or $b$; such can never be part of a shortest common supersequence, so they are superfluous. They can easily be added without breaking the properties of the reduction.

The numbers in the overlaps are encoded in binary, but using symbols outside of $\Sigma$ and padding them to a common length $\log \max(m,n)$. Thus we ensure that the tiles are used as the graphics suggest (tetris), that is characters and index-encoding overlaps do not mix (PCP does not prevent this per se). We need:

  • Starting tiles: $c$ can start with $a_1$, $b_1$ or both if they are equal.
  • Intermediate tiles: $c$ can proceed with the next symbol in $a$, in $b$ or both if they are equal.
  • Terminating tiles: $c$ ends with the last symbol of $a$ (if the last one of $b$ has been seen already), similar for $b$, or with the last symbol of both.

These are the tile schematics. Note that the intermediate tiles have to be instantiated for all pairs $(i,j) \in [n]\times [m]$. As mentioned above, create the tiles without $*$ only if the respective characters in $a$ and $b$ match.

enter image description here
[source]

The $*$ are symbolic for "don't care"; in the actual tiles, the other symbol will have to be copied there. Note that the number of tiles is in $\Theta(mn)$ and each tile has length $4\log \max(m,n) + 1$, so the constructed BPCP instance (over alphabet $\Sigma \cup \{0,1\}$ plus separation symbols) has polynomial size. Furthermore, the construction of every tile is clearly possible in polynomial time. Therefore, the proposed reduction is indeed a valid polynomial transformation which reduces the NP-complete shortest common supersequence problem to BPCP.

OTHER TIPS

I think that you can prove that BPCP is NP-complete by using a reduction similar to the one used to prove its undecidability. We will directly prove that BPCP is NP-complete by showing how to reduce any problem in NP to it in polynomial time.

The standard reduction used to prove that PCP is undecidable (sketched out here) works by building a series of tiles such that there is a PCP solution iff there is an accepting computation of a given TM $M$ on a string $w$. The number of tiles created in this reduction is polynomially large - specifically, the number of dominoes constructed is some function of the size of the tape alphabet and number of states in the TM. The only domino whose size may be large is the initial domino, which has $w$ written on it. If we generalize this reduction from working on deterministic TMs to working on nondeterministic TMs, then this introduces at most some constant number of dominoes, since the number of transitions is finite. Consequently, we can construct the standard set of dominoes for the normal undecidability reduction in polynomial time.

Given this, we can reduce any NP problem to BPCP as follows - given any NP problem, it has some polynomial-time NTM $M$ that runs in time $p(n)$. We can then reduce this problem to BPCP in polynomial time as follows - construct the standard set of dominoes from $M$, then ask whether there is a solution that uses $f(p(n))$ dominoes, where $f$ is some polynomial function that expresses the number of dominoes necessary for the solution to exist (this is probably something like $n^2$, and is certainly not exponential). Then, using the same proof you use to show that PCP is undecidable, you can prove that there is a solution to this BPCP instance that uses at most $f(p(n))$ tiles iff the original NTM $M$ accepts $m$ within $p(n)$ steps. Consequently, we have a polynomial-time reduction from every problem in NP to BPCP, so BPCP is NP-hard.

(We should also show that BPCP is in NP, but that's easy; just nondeterministically guess which dominoes to put in order, then deterministically verify it).

Hope this helps!

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top