Question

In Needleman-Wunsch and Smith-Waterman, what is the best way to implement traceback? Do we usually keep two matrices, one with each entry's predecessor? That is, each entry would be UP, DIAG, or LEFT. Or is there a simpler, more space-efficient way to do traceback? I understand the algorithms and how to get the maximum score, but not the traceback. Thanks!

Was it helpful?

Solution

Using 2 matrices will work but is the naive approach especially if size or memory are an issue. The problem is that 2 separate matrices are space inefficient.

Since there are only 3 possible directions for the trace back in N-W and 4 possible in S-W (you need to add the STOP), you can store each direction as 2 bits. If your scores will be small enough, you can pack both values of the corresponding matrix cells into a single cell of one matrix and then do bit masking to get the score and the traceback direction.

Or, if you still want 2 matrices, there is no reason to take up so much space for the traceback matrix. You can still pack your traceback matrix so that 4 traceback positions are in a single cell of a byte matrix. (you would have to similar bit masking).

OTHER TIPS

My understanding is yes, you do need 2 matrices.

And then you trace back from the bottom right position. See http://en.wikipedia.org/wiki/Smith%E2%80%93Waterman_algorithm

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top