String to string correction problem np-completeness proof

https://stackoverflow.com/questions/6037481

14-11-2019
|

Question

I have this assignment to prove that this problem:

Finite alphabet £, two strings x,y € £*, and a positive integer K. Is there a way to derive the string y from the string x by a sequence of K or fewer operations of single symbol deletion or adjacent symbol interchange?

is np-complete. I already figured out I have to make transformation from decision version of set covering problem, but I have no clue how to do this. Any help would be appreciated.

Solution

~~It looks like modified Levenshtein distance. Problem can be solved with DP in quadratic time.~~

Transformation from minimum set cover (MSC) to this string correction problem is described in:

Robert A. Wagner
On the complexity of the Extended String-to-String Correction Problem
1975, Proceedings of seventh annual ACM symposium on Theory of computing

In short with MSC problem:

Given finite sets x_1, ..., x_n, and integer L, does there exists a subset J of {1,...,n} such that |J| <= L, and

union_{j in J} x_j = union all x_i ?

Let w = union all x_i, let t = |w| and r = t^2, and choose symbols Q, R, S not in w.

Take strings:

A = Q^r R x_1 Q^r S^(r+1) ... Q^r R x_n Q^r S^(r+1)
B = R Q^r ... R Q^r w S^(r+1) ... S^(r+1)   <- each ... is n times
and
k = (l+1)r - 1 + 2t(r+1)(n-1) + n(n-1)(r+1)^2/2 + (r*n + |x_1 ... x_n| - t)*W_d
[W_d is delete operation weight, can be 1.]

It is shown that string-string correction problems (A,B,k) is satisfiable iff source MSC problem is.

From strings construction it is clear that proof is not trivial :-) But it isn't too complex to manage.

OTHER TIPS

The NP-hardness proof that is mentioned only works for arbitrarily large alphabets. For finite alphabets, the problem is polynomial-time, see https://dblp.org/rec/bibtex/journals/tcs/Meister15

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow