The correction is basically when the two strings in the relevant index have a matching substring (instead of a matching letter, as it is now).
The idea is instead of simply checking for a substring of size 1 in the original solution, check for a substring of length k
, and add 1 to the solution (and 'jump' by k
in the string).
The formula for the recursive approach that should be translated to the DP solution is:
f(i,0) = 0
f(0,j) = 0
f(i,j) = max{
f(i-k,j-k) + aux(s1,s2,i,j,k)
f(i-1,j)
f(i,j-1)
}
where aux(s1,s2,i,j,k)
is a function that is aimed to check if the two substrings are a match, and is defined as:
aux(s1,s2,i,j,k) = 1 | if s1.substring(i-k,i) equals s2.substring(j-k, j)
-infinity | otherwise
You can reconstruct the alignment later using an auxilary matrix that marks the choices of max{}
, and go from last to first when the matrix is complete.
Example:
bcbac
and cbcba
. k=2
Matrix generated by f
:
c b c b a
0 0 0 0 0 0
b 0 0 0 0 0 0
c 0 0 0 1 1 1
b 0 0 1 1 1 1
a 0 0 1 1 1 2
c 0 0 1 1 1 2
And for reproducing the alignment you generate 'choices' matrix:
1 - chose f(i-k,j-k) + aux(s1,s2,i,j,k)
2 - chose f(i-1,j)
3 - chose f(i,j-1)
d - don't care, (all choices are fine)
x/y -means one of x or y.
c b c b a
b 2/3 2/3 2/3 2/3 2/3
c 2/3 2/3 1 2 2
b 2/3 3 2/3 d 2/3
a 2/3 3 2/3 2/3 1
c 2/3 3 2/3 2/3 3
Now, reconstructing the alignment - start from the last (bottom right) cell:
- It's '3' - move up, don't add anything to the alignment.
- It's 1 - we need to add 'ba' to the alignment. (alignment='ba' currently). move k up and left.
- It's 1, ad 'bc' to the alignment. current alignment: 'bcba'. move k up and left.
- It's 2/3 - move left OR up.
The order of visiting while reconstructing is: (0 means - not visited, number in many cells means any of these is OK).
c b c b a
0 4 0 0 0 0
b 4 3 0 0 0 0
c 0 0 0 0 0 0
b 0 0 0 2 0 0
a 0 0 0 0 0 0
c 0 0 0 0 0 1