Longest common subsequence with fixed length substrings

Question

The correction is basically when the two strings in the relevant index have a matching substring (instead of a matching letter, as it is now).

The idea is instead of simply checking for a substring of size 1 in the original solution, check for a substring of length k, and add 1 to the solution (and 'jump' by k in the string).

The formula for the recursive approach that should be translated to the DP solution is:

f(i,0) = 0
f(0,j) = 0
f(i,j) = max{
          f(i-k,j-k) + aux(s1,s2,i,j,k)
          f(i-1,j)
          f(i,j-1)
             }

where aux(s1,s2,i,j,k) is a function that is aimed to check if the two substrings are a match, and is defined as:

aux(s1,s2,i,j,k) = 1         | if s1.substring(i-k,i) equals s2.substring(j-k, j)
                   -infinity | otherwise

You can reconstruct the alignment later using an auxilary matrix that marks the choices of max{}, and go from last to first when the matrix is complete.

Example:

bcbac and cbcba. k=2

Matrix generated by f:

      c   b   c  b   a
   0  0   0   0  0   0
b  0  0   0   0  0   0
c  0  0   0   1  1   1   
b  0  0   1   1  1   1
a  0  0   1   1  1   2
c  0  0   1   1  1   2

And for reproducing the alignment you generate 'choices' matrix:

1 - chose f(i-k,j-k) + aux(s1,s2,i,j,k)
2 - chose f(i-1,j)
3 - chose f(i,j-1)
d - don't care, (all choices are fine)
x/y -means one of x or y.

c      b      c     b      a

b   2/3    2/3    2/3   2/3    2/3
c   2/3    2/3     1     2      2   
b   2/3     3     2/3    d     2/3
a   2/3     3     2/3   2/3     1
c   2/3     3     2/3   2/3     3

Now, reconstructing the alignment - start from the last (bottom right) cell:

It's '3' - move up, don't add anything to the alignment.
It's 1 - we need to add 'ba' to the alignment. (alignment='ba' currently). move k up and left.
It's 1, ad 'bc' to the alignment. current alignment: 'bcba'. move k up and left.
It's 2/3 - move left OR up.

The order of visiting while reconstructing is: (0 means - not visited, number in many cells means any of these is OK).

      c   b   c  b   a
   0  4   0   0  0   0
b  4  3   0   0  0   0
c  0  0   0   0  0   0   
b  0  0   0   2  0   0
a  0  0   0   0  0   0
c  0  0   0   0  0   1